Good data-driven decision-making avoids common pitfalls
Avoiding cognitive biases that can lead to the use of incomplete or inaccurate data is crucial to the successful use of analytics.
Data-driven decision-making is the future of business.
But to succeed, it has to avoid the many pitfalls that arise from cognitive bias, according to Mike Gualtieri, an analyst at Forrester Research.
Many organizations are providing employees with self-service analytics tools to fuel data-driven decision-making on the front lines of their businesses, while other enterprises are embedding dashboards and data models in the applications where people do their everyday work so that they can make decisions in real time.
It's those organizations that are set up to thrive, especially now amid the COVID-19 pandemic when economic conditions are volatile, while those still relying mostly on experience and gut instinct to make decisions are in a more perilous position.
The caveat, however, is that data has to be properly interpreted in order for analytics to be effective.
Organizations must therefore avoid cognitive biases and their resulting pitfalls that can turn a well-intended decision-making process into one that actually harms the organization rather than helps it.
Two common examples of cognitive bias are confirmation bias and recency bias, Gualtieri recently noted during MicroStrategy World 2022, the analytics vendor's virtual user conference. Confirmation bias is when someone has a belief and then specifically chooses data to support that belief while ignoring data that might contradict it; recency bias is giving more weight to recent data than to older data.
"Data is rich," Gualtieri said. "But it has to be interpreted the right way."
Beyond confirmation bias and recency bias, Gualtieri identified another set of common biases that can derail data-driven decision-making.
Fictitious corollaries
Chief among them is belief in a fictitious corollary.
Fictitious corollaries occur when two data points mirror each other but aren't actually related -- yet because they mirror each other, an assumption is made that they're linked.
For example, Gualtieri showed that the per capita divorce rate in Maine from 2000 through 2009 was nearly identical to the per capita consumption of margarine in the state over that same period. A fictitious corollary would be to conclude that people who lived in Maine from 2000 through 2009 and consumed margarine were likely to get a divorce.
Another fictitious corollary is the relationship between the New York Yankees and University of Kentucky men's basketball team. Beginning in 1949, every year the Wildcats won the national title -- 1949, 1951, 1958, 1978, 1996, 1998 -- the Yankees won the World Series. The conclusion would seemingly be that if Kentucky wins the national title in the spring, bet on the Yankees to win the World Series in the fall.
But the link between Kentucky men's basketball and the Yankees is a fictitious corollary. The performance of one has nothing to do with the performance of the other. And in 2012, the corollary -- in truth just a coincidence -- ended when Kentucky won the national championship and the San Francisco Giants won the World Series.
Correlations between seemingly disparate data sets, of course, do exist. Smoking and lung cancer were not obviously connected, but researchers in the mid-20th century found and confirmed an overwhelming association between the two and established a science-based corollary.
"We have to use correlation, but be healthily skeptical of it," Gualtieri said. "Understand that correlation does not necessarily equal causation."
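One way to act on that skepticism can be sketched in a few lines of Python. The example below is a minimal illustration, not a method Gualtieri presented: it uses two made-up series that share nothing except a downward trend, and compares the correlation of their raw values with the correlation of their period-over-period changes. The names `pearson`, `changes`, `series_a` and `series_b` are hypothetical, and the numbers are synthetic rather than real divorce, margarine or sports data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def changes(xs):
    """Period-over-period differences of a series."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]


# Synthetic, made-up series (NOT real data): each is a steady decline plus
# its own unrelated wiggle. The only thing they share is a downward trend.
wiggle_a = [0.2, -0.2] * 5
wiggle_b = [3, 3, -3, -3, 3, 3, -3, -3, 3, 3]
series_a = [10.0 - 0.5 * i + wiggle_a[i] for i in range(10)]
series_b = [200.0 - 8.0 * i + wiggle_b[i] for i in range(10)]

r_levels = pearson(series_a, series_b)
r_changes = pearson(changes(series_a), changes(series_b))

print(f"correlation of raw levels:        {r_levels:.2f}")
print(f"correlation of period changes:    {r_changes:.2f}")
```

With these synthetic inputs, the raw levels correlate strongly (above 0.9) while the period-over-period changes show essentially no correlation. A gap like that suggests the two series may simply be trending in the same direction. It does not prove or disprove a causal link, which still requires domain knowledge and further analysis.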
The past does not predict the future
Another common bias that derails data-driven decision-making is an over-reliance on past performance as an indicator of future performance.
When looking at a trend line on a chart, the natural cognitive bias is to assume it will keep going. And sometimes the past does indeed preview the future.
But not always.
So when using past performance to try to predict future performance, time windows are important. A five-day window isn't nearly as indicative of continuation as a five-month or five-year window.
Capital markets provide a good example, according to Gualtieri.
General Electric's stock is up slightly in the roughly six weeks since the start of 2022, but over the past six years it has lost half its value. Conversely, Bitcoin is down from more than $64,000 per coin to about $47,000 per coin over the last three months, but over five years it's up from $1,062 per coin.
"You have to put [data] into context," Gualtieri said. "Drawing a conclusion from a trend over a particular period of time is a big mistake people can make. If you're making enterprise decisions, you need to look at [a broad] scope. Past performance is no guarantee of future results, so you have to do more analysis."
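The effect of the time window is easy to see with simple arithmetic. The sketch below plugs the approximate Bitcoin prices quoted above into a percentage-change calculation for each window. The function name `pct_change` and the variable names are hypothetical, and the prices are the rounded figures from the article rather than exact market data.

```python
def pct_change(start, end):
    """Percentage change from start to end."""
    return (end - start) / start * 100


# Approximate Bitcoin prices quoted above (USD per coin).
price_now = 47_000
price_three_months_ago = 64_000
price_five_years_ago = 1_062

short_window = pct_change(price_three_months_ago, price_now)
long_window = pct_change(price_five_years_ago, price_now)

print(f"three-month change: {short_window:+.1f}%")
print(f"five-year change:   {long_window:+.1f}%")
```

The same asset shows a drop of roughly 27% over three months and a gain of more than 4,300% over five years. Neither number is wrong, but each tells a very different story on its own.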
Data quality
Poor data quality is another factor that can lead to biases and bad decisions.
"'Garbage in, garbage out' still stands," Gualtieri said, referring to the longstanding tech adage GIGO.
Incomplete data, outdated data, bad data sources and query errors are ways that poor data quality can result in misinformed data-driven decision-making.
An example of the use of incomplete data came in 2016, when nearly everyone forecasting the U.S. presidential election expected Hillary Clinton to win. Incomplete data, however, resulted in a total misreading of what was to come on Election Day.
Forecasters relied too heavily on polls that were biased by who was polled and when they were polled. Those polls predicted a Clinton victory, when additional data sources might have revealed a different outcome.
"There was incomplete data," Gualtieri said. "[Prognosticators] didn't have the right data. There were some missing variables that people could not see. From an enterprise standpoint, the same thing can happen. You can have all kinds of data pointing in one direction, but if you don't have the complete set of data with all the missing variables, you're not going to get there."
Outdated data can come in the form of model drift, which refers to data models that decay over time because they haven't been maintained and retrained as new data becomes available. The use of bad data sources, meanwhile, can result from such things as blind faith in an author. Query errors are simply coding mistakes made when querying data that can lead to the use of the wrong data.
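Model drift in particular lends itself to a simple check. The sketch below is one possible approach under stated assumptions, not a technique described by Gualtieri: it compares a model's average error when it was deployed with its average error on recent data, and flags the model when the error has grown past a tolerance. The function names, the 1.5x tolerance and all of the numbers are hypothetical and chosen only for illustration.

```python
def mean_abs_error(actuals, predictions):
    """Average absolute gap between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actuals, predictions)) / len(actuals)


def has_drifted(baseline_error, recent_error, tolerance=1.5):
    """Flag drift when recent error exceeds the baseline by the given factor.

    The 1.5x default is an arbitrary illustrative threshold, not a standard.
    """
    return recent_error > baseline_error * tolerance


# Made-up numbers for illustration only.
# Error measured when the model was first deployed:
baseline_error = mean_abs_error([100, 110, 120], [98, 112, 119])
# Error measured on newer data the model was never retrained on:
recent_error = mean_abs_error([150, 160, 170], [130, 135, 140])

if has_drifted(baseline_error, recent_error):
    print("Model error has grown; consider retraining on newer data.")
```

In practice, the right error metric and threshold depend on the model and the business context. The point of the sketch is only that drift can be monitored rather than discovered after a bad decision.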
Overcoming pitfalls
Key to overcoming the cognitive biases that can derail data-driven decision-making is a strong data governance framework.
Data governance can address potential pitfalls, eliminating many instances of using bad data, limited past performance windows and fictitious corollaries.
"You can use business intelligence to actually overcome built-in cognitive biases," Gualtieri said.
Self-awareness, however, is also crucial. By being aware of the various cognitive biases that can arise, organizations can diagnose them as they creep in.
"We all have cognitive biases -- we're all susceptible to them," Gualtieri said. "A lot of the ways we can overcome them and find truth in data is by recognizing our cognitive biases."