Data Governance should be seen as a good habit, not a software package or an old solution.
Data Governance provides the rules that data should follow and, for the most part, relates to maintaining quality data. Just because a value fits into a certain column in a database does not mean that it is good data or that it has good data quality. People often treat governance as a synonym for quality; quality can certainly be viewed as one dimension of governance, but all too often it dominates the business case. Data Governance really comes down to making proactive decisions about what data is needed, what it means to the enterprise, how to understand the quality of that data (Big Data, Master Data, Reference Data, Transaction Data, etc.), and how to improve that quality where needed.
Data Governance is essential to meaningful and insightful reporting or business intelligence. Think about it: you cannot understand or measure the performance of your company without good quality data, and data governance ensures that you can get your hands on it. Data governance is, at its heart, a business intelligence practice; without effective business analysis of data there is no hope of understanding the data from the business perspective. BI applications are pretty close to useless without Data Governance in place and enforced. Data Governance leads to excellent Data Quality (DQ), which in turn leads to good Business Intelligence. BI without DQ is typically useless because decisions are not based on good quality information. A stronger link has to be drawn between the success of Business Intelligence and the need for Data Governance, and vice versa.
Both big data and small data suffer from low quality, but with big data, cleansing can be more arduous or, in most cases, infeasible. Is 100% clean data necessary? That depends on how the data is used. For example, human capital data has high variance by its very nature, yet with large enough data sets and as much cleansing as is practical, you can ignore the dirt, or let the statistical algorithms handle it, and still derive knowledge from the data. The great irony is that the quality and speed of achieving data governance are inescapably and directly connected to the quality of the analytics tool in use. Analytics is only as good as your data quality. Typically, during ETL, the IT team will not know what makes up good quality data unless rules (Data Governance) are established that apply to all the data and unless the business is involved in determining its accuracy.
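To make the idea of governance rules applied during ETL concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the field names (customer_id, order_total, country), the three rules, and the sample batch are hypothetical, and a real rule catalog would be defined by the business rather than hard-coded by IT.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A named, business-defined check that every record should pass."""
    name: str
    check: Callable[[Record], bool]


# Hypothetical rules; in practice the business, not IT, decides what goes here.
RULES = [
    Rule("customer_id is present", lambda r: bool(r.get("customer_id"))),
    Rule(
        "order_total is non-negative",
        lambda r: isinstance(r.get("order_total"), (int, float))
        and r["order_total"] >= 0,
    ),
    Rule("country is a known code", lambda r: r.get("country") in {"US", "CA", "GB"}),
]


def failed_rules(record: Record, rules: list[Rule] = RULES) -> list[str]:
    """Return the names of the rules this record violates."""
    return [rule.name for rule in rules if not rule.check(record)]


def pass_rates(records: list[Record], rules: list[Rule] = RULES) -> dict[str, float]:
    """Return, per rule, the fraction of records that satisfy it."""
    if not records:
        return {rule.name: 0.0 for rule in rules}
    return {
        rule.name: sum(rule.check(r) for r in records) / len(records)
        for rule in rules
    }


# Hypothetical batch as it might arrive in an ETL step.
SAMPLE_BATCH = [
    {"customer_id": "C001", "order_total": 120.5, "country": "US"},
    {"customer_id": "", "order_total": 40.0, "country": "CA"},
    {"customer_id": "C003", "order_total": -15.0, "country": "XX"},
    {"customer_id": "C004", "order_total": 10, "country": "GB"},
]

if __name__ == "__main__":
    for record in SAMPLE_BATCH:
        print(record, "->", failed_rules(record) or "ok")
    print(pass_rates(SAMPLE_BATCH))
```

Note that every record in the sample batch would load into a typed table without complaint; only the shared rules reveal which ones the business would not trust.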
"There's no such thing as bad data, only some that's misunderstood." All data, from wherever it comes, is legitimate and reflective of the systems that provide it. As such, the data reveals deep and essential truths about not only the business domain it covers but also the systems that capture it. (The practice of transforming data into a different form necessarily eliminates some of these truths.) Every malformed phone number, every mistyped address, every miscoded part number is an opportunity to identify and correct the root cause and improve the information universe, and there is tremendous benefit and value there. On the other hand, it matters how well the data reflects reality: what should count as information and what as noise? Will fixing wrong addresses increase potential profit? What percentage of wrong addresses should you expect, and does cleaning them improve your profit? In addition, good data from an IT perspective (making sure that the data types (int, char, varchar, bit) match and that the data is stored properly) is far different from good quality data from a business perspective (the kind of data: costs, sales, customer addresses, etc.).
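The question raised above, whether fixing wrong addresses actually improves profit, can be framed as simple arithmetic. The sketch below is only one way to frame it and rests on loud assumptions: every record is checked at a flat cost, each corrected address recovers a fixed amount of value, and the function and parameter names are hypothetical. Real figures would have to come from the business.

```python
def expected_gain_from_cleaning(
    num_records: int,
    error_rate: float,
    value_per_fixed_record: float,
    cost_per_record_checked: float,
) -> float:
    """Expected net gain from checking every record and fixing the bad ones.

    Assumes (hypothetically) a flat checking cost per record and a fixed
    value recovered for each wrong address that gets corrected.
    """
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must be between 0 and 1")
    recovered_value = num_records * error_rate * value_per_fixed_record
    cleaning_cost = num_records * cost_per_record_checked
    return recovered_value - cleaning_cost


def break_even_error_rate(
    value_per_fixed_record: float,
    cost_per_record_checked: float,
) -> float:
    """Error rate above which cleaning pays for itself under the same assumptions."""
    if value_per_fixed_record <= 0:
        raise ValueError("value_per_fixed_record must be positive")
    return cost_per_record_checked / value_per_fixed_record


if __name__ == "__main__":
    # Purely illustrative numbers, not taken from any real data set.
    gain = expected_gain_from_cleaning(10_000, 0.05, 4.0, 0.10)
    threshold = break_even_error_rate(4.0, 0.10)
    print(f"expected net gain: {gain:.2f}")
    print(f"cleaning pays off above an error rate of {threshold:.1%}")
```

Under this framing, cleaning is worthwhile only when the expected share of wrong addresses exceeds the ratio of checking cost to recovered value; below that threshold the dirt may be cheaper to tolerate.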
Data governance is critical, especially as organizations move to the cloud using the SaaS model; data is an asset that needs to be protected and used properly. The end result is that an organization's "Big Data" will be viewed as accurate and trustworthy. Further, proper data governance is not a one-time exercise but a constant review.