
Governance: The Big Data Elephant in the Room

SAP

Elephant in the room (Photo credit: Wikipedia)

By Dan Everett, Senior Director, EIM Solution Marketing, SAP

Most of the conversations I hear and articles I have seen about big data are focused on business analysis and innovation. Know more about your customers and the products and services they are looking for and you can positively impact revenue, margin and market share. These are great reasons to head down the big data path.

However, to realize business value from big data, companies need strong information governance, and few people seem to be talking about this.

To enable analysis and innovation opportunities with big data, companies need to integrate information from multiple data sources. This data will include both structured and unstructured data, which means text processing and entity extraction capabilities will need to be part of the data integration infrastructure. It will also be important for companies to look for data integration solutions that already include integrations and optimizations for Hadoop and MapReduce, so IT staff do not have to be experts in these areas.

Incompatible standards and formats of data in different sources can prevent the integration of data and the more sophisticated analytics that create value from big data. According to the Aberdeen Group (Data Management for BI: Fueling the analytical engine with high-octane information), Best-in-Class companies take 12 days on average to integrate new data sources into their analytical systems, Industry Average companies take 60 days, and Laggards take 143 days. Part of the large gap between best and worst is due to differences in the use of data quality technology and processes. So companies will need data profiling, cleansing, and metadata and master data management capabilities to realize the full business value of big data.

While it might seem like keeping every bit of data is the best approach, the cost makes it impractical. According to a McKinsey Global Institute study (Big data: The next frontier for innovation, competition and productivity), the projected growth in global data generated per year is 40%. So companies will need to balance long-term retention against cost and legal requirements. This means companies are going to have to determine what primary data needs to be kept live, what can be moved to an online archive, and what their policies are around retention and disposal.
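
To make the cost argument concrete, here is a minimal Python sketch of what 40% annual growth implies and how a tiered retention policy might be expressed. It assumes, purely for illustration, that a company's own data grows at the global rate cited above; the function names, the 100 TB starting volume, and the one-year and seven-year thresholds are hypothetical and do not come from the McKinsey study or from any product.

```python
import math

ANNUAL_GROWTH = 0.40  # the 40% yearly growth figure cited above


def projected_volume(current_tb: float, years: int,
                     growth: float = ANNUAL_GROWTH) -> float:
    """Compound the current volume forward by `years` at `growth` per year."""
    return current_tb * (1 + growth) ** years


def doubling_time_years(growth: float = ANNUAL_GROWTH) -> float:
    """Years for the volume to double at a constant yearly growth rate."""
    return math.log(2) / math.log(1 + growth)


def storage_tier(age_days: int, live_days: int = 365,
                 retention_days: int = 7 * 365) -> str:
    """Hypothetical policy: keep recent data live, archive older data,
    and dispose of data once the assumed retention period has passed."""
    if age_days <= live_days:
        return "live"
    if age_days <= retention_days:
        return "archive"
    return "dispose"


if __name__ == "__main__":
    # Assumed starting point of 100 TB, projected five years out.
    print(f"Volume after 5 years: {projected_volume(100, 5):.1f} TB")
    print(f"Doubling time: {doubling_time_years():.2f} years")
    for age in (30, 400, 3000):
        print(f"{age} days old -> {storage_tier(age)}")
```

Under these assumptions, volume roughly doubles every two years, which is why a "keep everything live" policy becomes expensive quickly. Actual thresholds would have to come from a company's own legal and business requirements.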

Yes, big data is a big business opportunity, but the business value won’t be realized if the information isn’t governed. How can you use big data to develop the next generation of products and services if you can’t do text processing to perform market, brand and sentiment analysis? How can you use big data to build predictive models if you can’t cleanse data and reconcile formats and meaning between your data and external data? How can you improve your bottom line if the additional revenue you are generating from big data is being eaten up by the IT capital and operating expenditures needed to store and manage that data?

Information governance in the context of big data is a topic that companies can’t afford to ignore. It can no longer be the elephant in the big data room that no one is talking about. Companies like Bio-Rad, Colgate-Palmolive and Sysco Foods have realized how corporate decision making is affected by both the volume and quality of their data, and have consequently implemented information governance. You can hear these companies talk about their experiences and practices at SAPPHIRE NOW Online.