Monday, April 13, 2015

Big Data vs Data Warehousing

The key is to focus on the convergence of analytics as a whole to gather a better and wider perception of reality and to make better decisions.
"Big Data" refers to a class of problems involving large volume, high variety, high velocity, and high variability of data. Traditional relational databases struggle with such problems. Data warehouses have been more about the data management component, with consistent access to data. Big Data has been more about using data ad hoc to discover business insight.

Big Data is Data Warehousing, but it offers much more than what you can gather from a typical data warehouse. You can leverage the computational power of Big Data using various tools and frameworks that make it much more than a typical data warehouse. Big Data can be considered a set of technologies used to develop a data-focused infrastructure targeted at solving business problems for an organization. A data warehouse is simply a component of the Big Data infrastructure. It also depends on how you define a data warehouse: logically or physically. Logically, you may choose to use Hadoop as your staging area, depending on your needs and the complexity of the data types that need to be analyzed and processed to make business decisions.

Big Data and Data Warehouse have different focuses: Big Data is not just about volume, but more about velocity, variety, and veracity, which are almost non-existent in the Data Warehouse world due to technological constraints and cost. Big Data is a collection of large data sets gathered in a particular manner, whereas a data warehouse collects data from the different departments of an organization and requires efficient management techniques. Conceptually, the two are the same in only one respect: both collect large amounts of information. In practical respects, it could be argued that Big Data offers a superset of data warehousing technology; does this count as the beginning of an evolution?
Traditional Data Warehouse:
- focuses on using a consistent, repeating set of metrics and methodologies both to measure past performance based on historical data and to devise new business strategy
- more static, descriptive and explanatory in nature
- mainly used for reporting, data mining and OLAP-like analysis
Big Data Analytics, by contrast:
- helps develop new strategic insights (not just reports) based on new, constantly changing and streaming data
- very dynamic, predictive and prescriptive in nature
- mainly used for text/sentiment analytics, machine learning and adaptive modeling
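
To make the contrast between the two lists a little more concrete, here is a minimal Python sketch, not taken from any particular product. It assumes a made-up list of sales figures (`historical_sales`) and two hypothetical helpers: `batch_average`, which recomputes a fixed metric over stored history the way a warehouse report might, and `RunningAverage`, which updates its estimate as each new event arrives, closer in spirit to the streaming style.

```python
from statistics import mean

# Hypothetical sales figures; in a warehouse these would already be stored history.
historical_sales = [120.0, 95.5, 130.25, 110.0]


def batch_average(records):
    """Warehouse-style metric: recompute a fixed measure over the stored history."""
    return mean(records)


class RunningAverage:
    """Streaming-style metric: refine the estimate as each new event arrives."""

    def __init__(self):
        self.count = 0
        self.value = 0.0

    def update(self, x):
        self.count += 1
        # Incremental mean: shift the current estimate toward the new observation.
        self.value += (x - self.value) / self.count
        return self.value


# Batch view: one number, computed after the data has been collected.
report_value = batch_average(historical_sales)

# Streaming view: a usable number exists after every single event.
stream = RunningAverage()
live_value = 0.0
for event in historical_sales:
    live_value = stream.update(event)
```

Both approaches arrive at the same average once all the events have been seen; the difference is that the streaming version never needs the full history at hand and has an answer available at every step.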

Big Data is not data warehousing. Ultimately it will be the direct flow of information into processors and algorithms that track, analyze, and output the desired information. So, Big Data is really all about surpassing the status quo in data science and moving toward a more fluid and dynamic real-time data environment, where data is useful almost instantaneously. Consequently, rather than separating data management by the technical constraints of Data Warehousing or Big Data, we should focus on the convergence of analytics as a whole to gather a better and wider perception of reality and to make better decisions.

