Why data literacy skills still matter with augmented analytics
Democratizing data analytics gives everyone access to tools and information, but data literacy is still required for analyzing data and delivering successful outcomes.
Augmented analytics promises to bring BI to a much larger audience of business users. Early implementations could prove useful in answering simple questions, like how much inventory an organization should plan to stock. A higher level of data literacy skills is still required for more complex types of analysis, however.
For example, when Gartner invited vendors to apply their augmented analytics tools to a sample data set at a BI Bake Off in 2016, only one, Salesforce Einstein, allowed business users to accurately identify the root driver. In this case, the tools were presented with a data set about college students to see which factors lead to higher long-term earnings. Most of the tools simply reinforced the bias that higher earnings were tied to attending Ivy League colleges, when the main driver was parents' income.
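The pattern behind that result can be illustrated with a minimal Python sketch. The records below are invented for illustration and are not the Bake Off data; the field layout and the `earnings_gap` helper are likewise hypothetical. The sketch compares the earnings gap between Ivy League attendees and everyone else, first across all students and then within each parent-income band.

```python
from statistics import mean

# Hypothetical toy records, NOT the Gartner Bake Off data:
# (parent_income_band, attended_ivy, later_earnings_in_thousands)
STUDENTS = [
    ("high", True, 90), ("high", True, 94), ("high", True, 92), ("high", False, 92),
    ("low", True, 60), ("low", False, 58), ("low", False, 62), ("low", False, 60),
]


def earnings_gap(rows):
    """Mean earnings of Ivy attendees minus mean earnings of everyone else."""
    ivy = [earnings for _, attended, earnings in rows if attended]
    other = [earnings for _, attended, earnings in rows if not attended]
    return mean(ivy) - mean(other)


# Naive view: compare Ivy attendees with non-attendees across all students.
overall_gap = earnings_gap(STUDENTS)

# Stratified view: make the same comparison inside each parent-income band.
gap_by_band = {
    band: earnings_gap([row for row in STUDENTS if row[0] == band])
    for band in ("high", "low")
}

print("Overall Ivy earnings gap:", overall_gap)
print("Gap within each parent-income band:", gap_by_band)
```

In this toy data, the overall gap is 16 (thousand), but it falls to zero inside each income band, because Ivy attendance is concentrated among students with high-income parents. A tool that reports only the first number points users to the wrong driver.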
Augmented analytics greatly reduces the data literacy needed to extract insights, but it does not remove the need to learn how analytics can mislead.
"We will always need expert data scientists to guide and verify the logic applied by augmented analytics to ensure that systems using them are well-behaved," said Micha Breakstone, co-founder and head of R&D at Chorus.ai, a conversational analytics service in San Francisco. This applies to making solid logical inferences and ensuring the applications in the real world are in line with the values of the people whose abilities they were designed to augment.
Enhancing users' data literacy skills may take time, argued Mark Palmer, general manager of analytics at TIBCO Software, based in Palo Alto, Calif. But, in the meantime, augmented analytics will reduce the burden on subject-matter experts to understand the data model used to display trend lines on a graph. These experts just need to use their domain expertise to interpret what the lines mean. In this sense, augmented analytics could use visualizations and embedded AI models to remove the barrier between complicated data and business insights.
Different types of literacy
Cassie Kozyrkov, chief decision scientist at Google, has suggested that enterprises could benefit from thinking about nine different roles on their data science teams, each with different kinds of data literacy skills. Basic literacy involves simply learning how to use the technical features of tools to draw pretty graphs. Augmented analytics tools help reduce the technical know-how needed to generate basic charts.
More sophisticated kinds of data literacy involve understanding how to set up a business question in a way that can be solved by data science. In this context, augmented analytics can reduce the effort spent by data scientists in setting up and exploring different data models that might be used by business domain experts, said Ryohei Fujimaki, CEO and founder of DotData, a data science automation platform in Cupertino, Calif.
Organizations could explore a hybrid approach in which business analysts and data analysts tackle different tasks, with analysts and BI engineers building data models and data scientists vetting them. This would free up data scientists to spend more time applying their data literacy skills to high-value use cases for the organization, Fujimaki said.
To reduce the data literacy requirements, analytics managers should look for ways to ensure the interpretability and transparency of the outcomes generated by augmented analytics tools. Many processes are automated in these tools, so it is critical that they produce transparent explanations that business users can easily understand, verify and act on.
This transparency and interpretability can help employees with differing data literacy skill levels understand outcomes and adopt them to get the best business results. With augmented analytics, business and data science will move closer together, and business-oriented data scientists will become increasingly important alongside "traditional" data scientists, Fujimaki explained.
One promising idea is to create a semantic abstraction tier that sits between enterprise data and the analytics platform.
"To ensure the insights are creditable, machine learning must be connected and understand enterprise data structures, data quality, integration and business semantics," said Nic Smith, global vice president of product marketing for cloud analytics at SAP. This kind of approach will make it easier for business experts and data scientists with more data literacy to work together more efficiently.