Big Data Detective: Can You Tell Good Data From Bad?

CenturyLink

In this age of big data, business leaders have access to unprecedented amounts of information about their customers, markets and industry. But how can they tell the good data—information that's accurate, relevant and free from bias—from the bad? Here are some tips for how to identify the best possible data to inform your business decisions.

Be A Good Steward

Many business leaders invest in data governance, which designates staff to ensure the quality and credibility of data. Whether it's a chief data officer who leads an entire team or a few dedicated employees, every organization needs at least one effective data steward to make sure data are being collected and handled properly.

Data collection conducted under stringent oversight enables business leaders to catch errors before that information is used to substantiate important decisions.

Data stewards can guard against misreading of data by emphasizing the importance of metadata—background information that defines the characteristics of data elements, said Manav Misra, chief science officer at CenturyLink Cognilytics.

“It's not just a one-time thing,” he added. “Every time specific data is moved or touched, you have to ensure that the quality of the data is not compromised.”

Scrutinize Data Vendors

It's one thing for an organization to rely on data that it has collected. But what about data generated by outside sources? Misra recommends applying the same level of scrutiny and control to both internal and external sources.

Ask third-party vendors if they can certify the data they're providing, he advised. Does the vendor have a data governance program in place? Does it keep logs that record how data is modified? Can the vendor show how the data originated, what path it took to get to your organization, and who touched it along the way?
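As a rough illustration of the kind of modification log a buyer might ask a vendor for, here is a minimal Python sketch. The class names, field names and sample entries are all hypothetical; they are not drawn from any vendor's actual system, and a real audit trail would likely record far more.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LineageEvent:
    """One entry in a data set's audit trail (hypothetical schema)."""
    actor: str           # who touched the data
    action: str          # what they did, e.g. "collected" or "deduplicated"
    timestamp: datetime  # when it happened (UTC)


@dataclass
class DataSetRecord:
    """A data set plus the history of everything done to it."""
    name: str
    origin: str  # where the data first came from
    events: list = field(default_factory=list)

    def log(self, actor: str, action: str) -> None:
        """Append one event to the audit trail."""
        self.events.append(
            LineageEvent(actor, action, datetime.now(timezone.utc))
        )

    def touched_by(self) -> list:
        """Return each actor once, in order of first appearance."""
        seen = []
        for event in self.events:
            if event.actor not in seen:
                seen.append(event.actor)
        return seen


# Made-up example: a survey data set passing through two vendor teams.
record = DataSetRecord(name="customer_survey", origin="vendor web form")
record.log("vendor_ingest", "collected")
record.log("vendor_cleaning", "deduplicated")
record.log("vendor_ingest", "exported")
print(record.touched_by())
```

With a record like this, the questions above (where the data originated, what path it took, who touched it) become lookups rather than guesswork.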

Don't be afraid to hold the bar as high for outside vendors as you do for your own business, he said.

Ask The Right Questions

Whether a business leader relies on internal or external sources of data, it's critical to ask the right questions about how the information was collected, Misra said.

There are many biases that can impact data sets. Confirmation bias occurs when the person who does the analysis selects data that support a certain hypothesis or belief, he said.

Selection bias can occur when data are skewed because of the way they are collected, such as when a particular group is either excluded from or over-represented in a survey.
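One simple way to look for over- or under-representation is to compare each group's share of the sample with its share of the population the survey is meant to describe. The Python sketch below assumes those population shares are known; the function name, the age groups, the figures and the five-point tolerance are all invented for illustration.

```python
def representation_gaps(sample_counts, population_shares, tolerance=0.05):
    """Return groups whose sample share differs from the population share.

    sample_counts: respondents per group, e.g. {"18-34": 70}
    population_shares: expected fraction per group, e.g. {"18-34": 0.30}
    tolerance: largest gap (as a fraction) treated as acceptable
    """
    total = sum(sample_counts.values())
    if total == 0:
        raise ValueError("sample is empty")

    gaps = {}
    for group, expected in population_shares.items():
        observed = sample_counts.get(group, 0) / total
        if abs(observed - expected) > tolerance:
            # Positive means over-represented, negative under-represented.
            gaps[group] = round(observed - expected, 3)
    return gaps


# Made-up survey of 100 respondents that skews heavily toward one age group.
survey = {"18-34": 70, "35-54": 25, "55+": 5}
population = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}

gaps = representation_gaps(survey, population)
print(gaps)
```

A check like this only catches skew along the groups someone thought to measure, so it supplements rather than replaces asking how the data were collected.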

To better understand how data are collected, business leaders should ask for details about the data model, or the framework used to organize the data. They should also ask about any statistical model built on that data: when a model is too complex, it does a poor job of “generalizing,” or predicting future outcomes, Misra added.
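The point about overly complex models can be seen in a toy experiment. The Python sketch below generates synthetic data from a straight-line trend plus random noise, then compares a simple fitted line with a "memorizer" that answers with the outcome of the closest case it has already seen (a one-nearest-neighbor lookup, used here only as a stand-in for an overly complex model). The data, the seed and both models are assumptions made for illustration.

```python
import random


def fit_line(points):
    """Ordinary least-squares line through (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_memorizer(points):
    """Overly flexible model: repeat the y of the nearest training x."""
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


def mean_squared_error(model, points):
    return sum((model(x) - y) ** 2 for x, y in points) / len(points)


def make_points(rng, n):
    """Synthetic data: y = 2x plus noise with standard deviation 1."""
    points = []
    for _ in range(n):
        x = rng.uniform(0, 10)
        points.append((x, 2 * x + rng.gauss(0, 1)))
    return points


rng = random.Random(0)          # fixed seed so runs are repeatable
train = make_points(rng, 200)   # data the models are built on
test = make_points(rng, 200)    # fresh data the models have never seen

line = fit_line(train)
memorizer = fit_memorizer(train)

for name, model in [("line", line), ("memorizer", memorizer)]:
    print(name,
          "train error:", round(mean_squared_error(model, train), 2),
          "test error:", round(mean_squared_error(model, test), 2))
```

The memorizer reproduces the data it was built on exactly, noise included, which is why its error on fresh data tends to be worse than the simple line's. A useful question for any analyst is therefore how the model performed on data it had not seen.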

Watch For Red Flags

Even when business leaders do everything they can to ensure that their data are accurate and unbiased, errors can still occur. That's why it's important to watch for red flags, including unexpected fluctuations in newly collected data, Misra advised.

If data results vary widely from one report to the next, it could indicate that something is wrong in either the underlying data or in the analysis that was performed, he said.
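One common way to turn "varies widely" into something checkable is to measure how far a new figure sits from the historical average, in units of the historical standard deviation. The Python sketch below does that; the function name, the threshold of three standard deviations and the sign-up numbers are all assumptions chosen for illustration, and a flagged value is a prompt to investigate, not proof of an error.

```python
from statistics import mean, stdev


def is_red_flag(history, new_value, max_z=3.0):
    """Flag a new report value that is far outside the historical range.

    history: the same metric from earlier reports (at least two values)
    new_value: the metric from the latest report
    max_z: how many standard deviations from the mean counts as suspicious
    """
    if len(history) < 2:
        raise ValueError("need at least two historical values")

    spread = stdev(history)
    if spread == 0:
        # History never varied, so any change at all is worth a look.
        return new_value != history[0]

    return abs(new_value - mean(history)) / spread > max_z


# Made-up monthly sign-up counts from five earlier reports.
monthly_signups = [1020, 980, 1005, 995, 1010]

print(is_red_flag(monthly_signups, 1000))  # in line with earlier reports
print(is_red_flag(monthly_signups, 1900))  # far outside the usual range
```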

Business leaders should also lean on their business knowledge to ask questions, Misra added. That's what one credit card company business leader did when an analyst presented him with results that seemed to imply that the average customer had about 100 checking accounts. The business leader immediately disputed the conclusion and decided that the rest of the data was suspect as well.

Don't be shy about investigating the quality of the data presented, or the analysis that was done, when the information defies experience and common sense, Misra said.
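Business knowledge of this kind can also be written down as plausibility ranges and checked automatically before a report reaches a decision-maker. The Python sketch below is one way to do that; the metric names and the bounds are hypothetical, and in practice they would come from people who know the business.

```python
def implausible_metrics(report, plausible_ranges):
    """Return the names of metrics whose values fall outside expected bounds.

    report: metric name -> reported value
    plausible_ranges: metric name -> (lowest, highest) believable value
    """
    problems = []
    for metric, (low, high) in plausible_ranges.items():
        if metric not in report:
            continue  # nothing reported for this metric, nothing to check
        if not low <= report[metric] <= high:
            problems.append(metric)
    return problems


# Hypothetical bounds supplied by someone who knows the business.
ranges = {
    "avg_checking_accounts_per_customer": (0, 5),
    "avg_customer_age": (18, 100),
}

# A report echoing the checking-account anecdote above.
report = {
    "avg_checking_accounts_per_customer": 100,
    "avg_customer_age": 42,
}

print(implausible_metrics(report, ranges))
```

Passing such a check does not make a figure correct; failing it is simply a cheap signal that the underlying data or the analysis deserves a closer look.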

While coming away with good data requires an investment in sound governance and management strategies, many organizations have reaped the benefits of doing so, Misra said. The bottom line: basic training in understanding what constitutes good data can turn hunch-driven businesses into data-driven businesses.