Tuesday, January 22, 2013

Three Big "Q"s in Big Data

There’s no “perfect” data in the Big Data world; accuracy and compromise will continue to coexist across the span of information management.

The Big Data battle is still on. Most organizations have not yet seen a clear signal to declare victory, though everyone is talking about the “V” words these days: Volume, Velocity, and Variety, in search of big Value. If we switch the angle a bit, how does Big Data look through the lens of “Q”?

1.    Quantity

The growth of data is exponential: the volume of business data worldwide is expected to double every 1.2 years. Data is growing at a 40% compound annual rate, reaching nearly 45 ZB by 2020. Every day, 2.5 quintillion bytes of data are created, and 90% of the world’s data was created in the past two years. Data production will also be 44 times greater in 2020 than in 2009.
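The growth figures above can be cross-checked with a little compound-growth arithmetic. The sketch below is only an illustration: it assumes a constant annual growth rate and treats 2009 to 2020 as 11 years, and the function names are made up for this example.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed to double at a constant compound annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


def implied_annual_growth(multiple: float, years: float) -> float:
    """Constant annual growth rate that produces `multiple` over `years`."""
    return multiple ** (1 / years) - 1


if __name__ == "__main__":
    # Assumption: growth is constant from year to year.
    print(f"Doubling time at 40% per year: {doubling_time_years(0.40):.2f} years")
    print(f"Rate implied by 44x over 11 years: {implied_annual_growth(44, 11):.1%}")
    print(f"Rate implied by doubling every 1.2 years: {implied_annual_growth(2, 1.2):.1%}")
```

Under these assumptions, the 40% rate and the 44-fold figure agree closely (about 41% a year, doubling roughly every two years). Doubling every 1.2 years works out to roughly 78% a year, so that figure presumably describes business data specifically rather than all data.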

But what exactly is the difference between “a lot” of data and “big data”? Information becomes big data when its volume can no longer be managed with normal database tools. Because of the mountain of information that companies produce and spread through social networks, businesses now face the challenge of processing all this data in a short time.

Perhaps Big Data is not necessarily that 'big', and not all big data is new data: a wealth of generated data sits unused, or at least is not used effectively. To process it, start with a small data sample. You don’t need the full width and depth of your data to find interesting things, and starting small saves a lot of technology headaches. The sample itself is important: does it represent the target of your business problem? With Big Data you can sample just the part of the population that you are interested in and still have a sample large enough to do meaningful work. That is the advantage of Big Data for analysis. In addition, Big Data processing reveals complex relationships across large, diverse data sets, so humans remain a crucial ingredient for interpreting the data into insight.
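One way to act on the "start small" advice above is to draw a random sample and check that it resembles the full data set on a dimension that matters to the business problem. This is a minimal sketch using only the Python standard library; the record layout and the "segment" field are hypothetical and stand in for whatever your own data contains.

```python
import random
from collections import Counter


def segment_shares(records, key):
    """Share of records falling into each value of `key`."""
    counts = Counter(r[key] for r in records)
    total = len(records)
    return {segment: n / total for segment, n in counts.items()}


def draw_sample(records, size, seed=0):
    """Simple random sample without replacement (seeded for repeatability)."""
    return random.Random(seed).sample(records, size)


def max_share_gap(population, sample, key):
    """Largest difference in segment share between population and sample."""
    pop = segment_shares(population, key)
    smp = segment_shares(sample, key)
    return max(abs(pop[seg] - smp.get(seg, 0.0)) for seg in pop)


if __name__ == "__main__":
    # Hypothetical customer records: about three quarters retail, one quarter business.
    rng = random.Random(42)
    population = [
        {"customer_id": i,
         "segment": rng.choice(["retail", "retail", "retail", "business"])}
        for i in range(100_000)
    ]
    sample = draw_sample(population, 2_000)
    print("Largest segment-share gap:", max_share_gap(population, sample, "segment"))
```

A small gap suggests the sample is a reasonable stand-in for the population on that dimension. It says nothing about dimensions you did not check, which is where human judgment comes back in.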

2.    Quality

There’s no “perfect” data in the Big Data world; accuracy and compromise will continue to coexist across the span of information management. Big Data quality efforts need to be defined more in terms of profiling and standards than cleansing, which is better aligned with how big data is managed and processed.

Data experts also offer a better definition of data quality: “the extent to which the data actually represents what it purports to represent.” Data quality is a multidimensional concept:
  • Objective Data Quality Dimensions: Integrity, Accuracy, Validity, Completeness, Consistency, Existence
  • Subjective Data Quality Dimensions: Understandability, Objectivity, Timeliness, Relevance, Interpretability, Trust
“Good enough” data can be more useful than perfect data, as long as the information is good enough for the recipient to make sound business decisions or solve specific business problems. It takes longer to make the data more accurate, and that delay may actually diminish its value rather than improve it. Weighing data quality against cost and benefit is therefore important.
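To make "profiling rather than cleansing" concrete, here is a rough sketch that measures two of the objective dimensions listed above, completeness and validity, for a single field and compares them with a "good enough" threshold. The field name, the validity rule, and the thresholds are all assumptions chosen for illustration.

```python
def profile_column(rows, column, is_valid):
    """Measure completeness and validity of one column without changing the data."""
    total = len(rows)
    present = [r[column] for r in rows if r.get(column) not in (None, "")]
    valid = [v for v in present if is_valid(v)]
    return {
        # Share of rows where the value is present at all.
        "completeness": len(present) / total if total else 0.0,
        # Share of present values that pass the validity rule.
        "validity": len(valid) / len(present) if present else 0.0,
    }


def good_enough(profile, min_completeness, min_validity):
    """True when the profile meets the thresholds the business has agreed on."""
    return (profile["completeness"] >= min_completeness
            and profile["validity"] >= min_validity)


if __name__ == "__main__":
    # Hypothetical records with one missing and one implausible age.
    rows = [{"age": 30}, {"age": None}, {"age": -5}, {"age": 41}]
    profile = profile_column(rows, "age", lambda v: 0 <= v <= 120)
    print(profile)
    print("Good enough:", good_enough(profile, min_completeness=0.7, min_validity=0.6))
```

The thresholds are where the cost/benefit trade-off shows up: raising them buys accuracy at the price of time, and the right setting depends on the decision the data has to support.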

3.    Quantum Leap

Big Data promises to be transformative, not just transactional: across industries, large data sets coupled with massive processing capabilities can spur growth and reveal cost-efficiency opportunities. Almost all industries are immersed in a transformation that leverages analytics and big data.

The greatest rewards go to those with a clear vision of how it can transform their organization, capabilities, and industry. According to an industry survey, more than 90% of Fortune 500 companies will have at least one big data initiative. The effective use of data can deliver substantial top- and bottom-line benefits. Building business capabilities through it will not only improve performance in traditional segments and functions, but also create opportunities to expand product and service offerings and create new business models.

Real World Data Puzzles: If every passenger runs 100 searches before buying an air ticket (a number in line with actual behavior) and each search looks at 1,000 flights, then the airlines would need to answer 15,000,000 questions per second. Neither their networks nor their computers can handle this.
Why is this important? Because data makes all the difference in the world in a time-critical environment, so keep business questions in mind, such as:
 “What are the problems we need to solve? What is the insight? How can the data provide guidance to make the best decision?”
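The airline puzzle above is back-of-envelope arithmetic, and it can be worked backwards to see what load it assumes. The sketch below treats each flight examined in a search as one "question". The passenger rate is not given in the text, so the value computed here is derived from the stated figures, not a sourced number.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def questions_per_passenger(searches: int, flights_per_search: int) -> int:
    """Flight lookups generated by one passenger before buying a ticket."""
    return searches * flights_per_search


def implied_passengers_per_second(total_questions_per_second: float,
                                  searches: int,
                                  flights_per_search: int) -> float:
    """Ticket buyers per second needed to produce the stated question rate."""
    return total_questions_per_second / questions_per_passenger(
        searches, flights_per_search)


if __name__ == "__main__":
    per_passenger = questions_per_passenger(100, 1_000)
    rate = implied_passengers_per_second(15_000_000, 100, 1_000)
    print("Questions per passenger:", per_passenger)
    print("Implied ticket buyers per second:", rate)
    print("Implied ticket buyers per day:", rate * SECONDS_PER_DAY)
```

Each passenger generates 100,000 lookups, so the 15,000,000 figure corresponds to about 150 ticket buyers per second. Changing either assumption (searches per passenger or flights per search) scales the load proportionally, which is why the business question behind the data matters as much as the raw volume.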

Some even think Big Data is to the 21st century what the Industrial Revolution was to the 18th and 19th centuries. As organizations increasingly experiment to capture big data’s potential for both short- and long-term advantage, they may build out the capabilities necessary to capitalize on that potential, or perhaps embrace the creative destruction of their business model.

These three “Q” words for Big Data go hand in hand with Questioning: asking the right questions to capture insight from Big Data and pursue a quantum leap in business transformation.

