Big Data is not a new term. It was first introduced in the U.S. in the 1980s. "Big" means not only that the volume of data is huge, but also that the amount of data we can analyze and use is growing exponentially.
Moore’s Law, the observation that the number of transistors on a chip doubles roughly every two years, has meant that computer processing power and storage capacity have kept growing at a comparable pace.
This explosive growth of data shows up along three dimensions:
- Volume: the amount of data of the same type increases rapidly.
- Velocity: the speed at which data is obtained keeps rising.
- Variety: the range of data continues to grow, in both data sources and data types.
How to collect, store, maintain, manage, analyze and share Big Data remains a challenge worldwide.
Ramayya Krishnan mentioned that Big Data has the potential to fundamentally transform society. However, unlocking this potential will require careful attention to data governance and insightful application of data analytics combined with an environment that spurs managerial innovation.
From a historical perspective, accepting a new trend and taking advantage of it takes courage and persistence, along with the improvement and integration of legislation, technology and talent.
Let’s first take a look at how the U.S. government prepared itself to adapt to the trend. The Freedom of Information Act (FOIA), signed in 1966, took effect after an arduous 12-year effort by John Moss, the "Father of FOIA," and Donald Rumsfeld, who acted as the icebreaker.
In 1988, as the Internet was starting to become a popular concept, Mark Weiser, a scientist at Xerox, first defined Ubiquitous Computing. The basic idea is that everything that exists in the world can be connected to everything else, and every connected thing can be computed on. This is achieved by connecting small computing devices, so that data can be collected anywhere, anytime. Eventually, computers will blend into the environment as a whole.
Another important concept is Business Intelligence, which is defined as the ability "to collect, store, analyze and share data." In 1992, Bill Inmon, the father of the Data Warehouse, defined the data warehouse as a subject-oriented, integrated, non-volatile and time-variant data store that supports decision making. In 1993, Edgar Codd introduced Online Analytical Processing (OLAP).
In my opinion, one of the most important concepts is visual explanation: the visual display of quantitative information. As data has become more and more complex, people can hardly understand what it means. Visual explanation helps ordinary people understand the trend behind the data. Design is pivotal for data visualization. Edward Tufte said that there is no such thing as information overload, just bad design. If something is cluttered or confusing, fix your design.
The U.S. has led the software industry for many years. In 1959, the U.S. Department of Defense and IBM jointly developed the Semi-Automatic Ground Environment (SAGE) project. It was dedicated to collecting and processing radar signals in order to monitor aircraft activity. It lasted 30 years and cost approximately $10 billion (U.S.). Together with other projects of similar size, it trained many data analysts, developers and architects. These were called the "West Point" training projects. Those talented people then joined the private sector and developed all kinds of software, which has kept the U.S. at the top of the global software industry.
The Impact of Big Data on Individuals, Organizations and Government
Individual Data Privacy
The U.S. Privacy Protection Study Commission (1977) observed that the real danger to individuals is the gradual erosion of individual liberties through the automation, integration, and interconnection of many small, separate record-keeping systems, each of which alone may seem innocuous, even benevolent, and wholly justifiable.
In Nineteen Eighty-Four, written in 1948 and published in 1949, George Orwell described a world where you had no place to hide, whether you were asleep or awake, working or eating, indoors or outdoors, in bed or in the bathtub. Except for what was inside your brain, nothing belonged to you. There was a Big Brother who controlled everything. Today, Big Brother could be the central data bank. The impact of integrating data from different sources is immense.
Organization – Data Driven Decision Making
In 1989, Howard Dresner first defined Business Intelligence as a fact-based method of business decision making.
Wal-Mart is often cited as the first organization to apply data mining techniques. The famous case study is its successful promotional campaign that packaged diapers with beer. Through data analysis, Wal-Mart found that 30% to 40% of new fathers tend to buy beer as a treat for themselves when they purchase diapers. The cross-selling campaign was a huge success, and data-driven decision making has been widely used in the business world ever since.
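The kind of analysis behind the diaper-and-beer story is usually called market-basket analysis. The sketch below is only an illustration, not Wal-Mart's actual method: the baskets are invented toy data, and the function name `rule_metrics` is a hypothetical helper. It computes three standard measures for a rule such as "diapers -> beer": support (how often both items appear together), confidence (how often beer appears given diapers), and lift (how much more often than chance the pair occurs).

```python
def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    `baskets` is a list of sets, one set of item names per shopping trip.
    This is a hypothetical helper written for illustration only.
    """
    n = len(baskets)
    has_a = sum(1 for b in baskets if antecedent in b)
    has_c = sum(1 for b in baskets if consequent in b)
    has_both = sum(1 for b in baskets if antecedent in b and consequent in b)

    support = has_both / n                           # P(A and C)
    confidence = has_both / has_a if has_a else 0.0  # P(C | A)
    # Lift compares P(C | A) with the baseline P(C); above 1 means
    # the two items appear together more often than chance would suggest.
    lift = confidence / (has_c / n) if has_c else 0.0
    return support, confidence, lift


# Invented toy data (assumption): eight shopping baskets.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"milk", "bread"},
    {"diapers", "milk"},
    {"diapers", "beer", "chips"},
    {"bread", "chips"},
]

support, confidence, lift = rule_metrics(baskets, "diapers", "beer")
print(f"support={support:.3f} confidence={confidence:.2f} lift={lift:.2f}")
```

In this toy data set, 60% of baskets containing diapers also contain beer, against a 50% baseline for beer overall, so the lift is above 1. A real retailer would run this over millions of transactions and many item pairs, typically with a dedicated algorithm such as Apriori rather than a pairwise count.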
Government – Increase Efficiency and Effectiveness
The U.S. government has also benefited from Big Data analysis. Jack Maple created the CompStat (Computer Statistics) methodology to analyze the causes and trends of crime in New York City. It has proven effective and efficient at reducing crime and making better use of the police force.
The collection, storage, management, analysis and sharing of Big Data will continue to evolve as the technologies and techniques that use it develop and as the economic benefits of using data continue to grow.
Source: “Big Data”, Tu Zipei