
Infusion of generative AI into analytics a work in progress

Many vendors have introduced tools that integrate LLM technology with their platforms, but most are still refining the tools to ensure their accuracy and security.

The integration between generative AI and analytics remains under development.

Many vendors have unveiled plans to enable customers to query and analyze data using conversational language rather than code or the rigid, business-specific phrasing required by earlier natural language processing (NLP) capabilities.

In addition, many have introduced AI assistants that users can ask for help while executing various tasks, as well as tools that automatically summarize and explain data products such as reports, dashboards and models.

Some have also introduced SQL generation features that reduce the coding required to model data, along with automated tools that offer suggestions as developers build data products.

Sisense was perhaps the first analytics vendor to reveal plans to integrate its platform with generative AI (GenAI) capabilities, introducing an integration with OpenAI -- developer of ChatGPT -- in January 2023. Two months later, ThoughtSpot unveiled Sage, a tool that combined the vendor's existing natural language search capabilities with large language model (LLM) capabilities to enable conversational language interactions with data.

By summer, Tableau and Qlik were among the vendors that had introduced generative AI plans. In addition, tech giants AWS, Google and Microsoft -- developers of analytics platforms QuickSight, Looker and Power BI, respectively -- were all working to add generative AI to their BI tools.

But as the end of 2023 nears, most analytics vendors' generative AI capabilities are still in some stage of development and have not yet been made generally available.

There are exceptions.

For example, MicroStrategy in October made NLP and text-to-code translation capabilities generally available. Similarly, Domo in August released NLP and AI model management capabilities as part of its Domo AI suite.

Most other capabilities, however, are still being refined, according to David Menninger, an analyst at Ventana Research.

"There are some that are available today, but the majority are still in preview," he said.

Perhaps the main reason for the holdup is that it's difficult to take a brand-new technology and make it one of the most significant parts of an established platform.

It takes time to get it right, and vendors are attempting to do so before they release tools to the public, according to Sumeet Arora, ThoughtSpot's chief development officer. Before releasing tools, vendors need to make sure that responses to natural language queries are accurate and that the data organizations load into LLM-integrated analytics tools remains private and secure.

"The most difficult technical problem is how to leverage GenAI to answer natural language questions with 100% accuracy in the enterprise. That is not a straightforward problem," Arora said.

He noted that OpenAI's GPT-4 and Google's Gemini answer questions with just under 80% accuracy.

"That is not good enough in analytics," Arora said. "The question is how to get to 100% accuracy in analytics. That has been the journey of the last year."

[Infographic: Seven benefits of generative AI for the enterprise.]

The promise

There are two big reasons so many vendors have made generative AI a focal point of product development.

One is the potential to expand BI use within organizations beyond a small group of highly trained users. The other is the possibility of making existing data experts more efficient. Both come down to the simplification of previously complex processes, according to Francois Ajenstat, now the chief product officer at digital analytics platform vendor Amplitude after 13 years at Tableau.

"GenAI really drives radical simplicity in the user experience," he said. "It will open up analytics to a wider range of people and it will enable better analysis by doing a lot of the hard work on their behalf so they can get to the insights faster and focus on what matters."

Analytics use has been stuck at the same level for more than a decade, according to industry studies.

BI platforms are complicated: Many tasks require coding skills, and even low-code/no-code tools aimed at self-service users demand data literacy training. As a result, only about a quarter of employees within organizations use analytics as a regular part of their job.

Generative AI can change that by enabling true conversational interactions with data.

Many vendors developed their own NLP capabilities in recent years. But those NLP tools had limited vocabularies. They required highly specific business phrasing to understand queries and generate relevant responses.

The generative AI platforms developed by OpenAI, Google, Hugging Face and others are built on LLMs trained with extensive vocabularies, eliminating at least some of the data literacy training that previous NLP tools required.

Meanwhile, by reducing the need to code, generative AI tools can make trained data workers more efficient.

Developing the data pipelines that feed data products requires copious amounts of time-consuming coding. Text-to-code translation capabilities let data workers write commands in conversational language that are then translated into code, greatly reducing those tasks and freeing data workers to do other things.
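
The mechanics behind such text-to-code features can be sketched in a few lines. The following is a minimal, hypothetical example, assuming an OpenAI-style chat completions API; the model name, schema, prompt and to_sql function are illustrative, not any vendor's actual implementation:

```python
# Minimal text-to-SQL sketch using an OpenAI-style chat completions API.
# The schema, model name and prompt are illustrative assumptions, not any
# analytics vendor's actual implementation.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Schema context the LLM needs so its SQL is grounded in real tables/columns.
SCHEMA = """
Tables:
  orders(order_id, customer_id, order_date, total_amount)
  customers(customer_id, region, signup_date)
"""

def to_sql(question: str) -> str:
    """Translate a conversational question into a single SQL query."""
    response = client.chat.completions.create(
        model="gpt-4",  # illustrative model choice
        temperature=0,  # deterministic output is preferable for code generation
        messages=[
            {"role": "system",
             "content": "Translate the user's question into one ANSI SQL query "
                        "against this schema. Return only SQL.\n" + SCHEMA},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content.strip()

print(to_sql("What was total revenue by region last quarter?"))
```

In a shipping product, the generated SQL would also be validated against the platform's semantic layer before execution, which is part of the accuracy work vendors describe below.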

MicroStrategy and AWS are among the vendors that have introduced generative AI tools that enable natural language query and analysis. They also released capabilities that automatically summarize and explain data.

In addition, Qlik with Staige and Domo with Domo AI -- along with a spate of data management vendors -- are among those that have gone further and introduced text-to-code translation capabilities.

Ultimately, one of the key reasons analytics vendors are developing so many generative AI tools is the promise of improved communication, according to Donald Farmer, founder and principal of TreeHive Strategy.

"The most obvious [benefit of generative AI] is that it has an ability to communicate what it finds," Farmer said. "Ninety percent of the problem of analytics is explaining your answers to people or getting people to understand what has been discovered."

Two more benefits -- perhaps geared more toward developers and data engineers -- are its intuitive handling of data integration and its ability to generate test code to help develop algorithms, Farmer added.

With respect to the generative AI capabilities themselves, some vendors are experimenting a bit more radically than others.

NLP, AI assistants and text-to-code translation capabilities are the most common features that vendors have introduced to simplify data preparation and analysis. Others, however, represent the cutting edge.

Menninger cited Tableau Pulse, a tool that learns Tableau users' behavior to automatically surface relevant insights, as something beyond what most vendors so far have publicly revealed. In addition, he noted that some vendors are working on features that automate tasks such as metadata creation and data cataloging that otherwise take significant time and manual effort.

"The cutting edge is metadata awareness and creation, building a semantic model with little or no intervention by the user," Menninger said. "Take away NLP and the other great value of GenAI is automation. It can automate various steps that have been obstacles to analytics success in the past."

Farmer, meanwhile, named Microsoft Copilot, Amazon Q from AWS and Duet AI from Google as comprising the current cutting edge.

Unlike analytics specialists whose tools deal only with analyzing data, the tech giants have the advantage of managing an entire data ecosystem and thus gaining a deep understanding of a particular business. Their generative AI tools, therefore, are being integrated not only with BI platforms but also with data management, supply chain management, customer service and other systems.

"The stuff that looks most interesting is the copilot stuff -- the idea of AI as something that is there in everything you do and absolutely pervasive," Farmer said. "It's only the big platform people that can do that."

The delay

Often, after tools are unveiled in preview, it takes only a few months for vendors to make whatever alterations are needed and release the tools to the public.

That hasn't been the case with generative AI capabilities.

For example, ThoughtSpot introduced Sage nine months ago, and it is not yet generally available. The tool was in private preview for a couple of months and then moved to public preview, but the vendor is still working to make sure it's enterprise-ready.

Similarly, Tableau unveiled Tableau GPT and Tableau Pulse in May, but both are still in preview. The same is true of Microsoft's Copilot in Power BI, Google's Duet AI in Looker and Spotfire's Copilot, among many other generative AI tools.

As Arora noted, accuracy, data privacy and data security are at the core of ensuring generative AI tools are enterprise-ready.

AI hallucinations -- incorrect responses -- have been an ongoing problem for LLMs. In addition, their security is suspect, and they have been susceptible to data breaches.

Reducing incorrect responses takes training, which is what vendors are now doing, according to Arora.

He noted that by working with customers using Sage in preview, ThoughtSpot has been able to improve Sage's accuracy to over 95% by combining generative AI with human training.

"What we have figured out is that it takes a human-plus-AI approach to get to accuracy," Arora said. "ThoughtSpot is automatically learning the business language of the organization. But we have made sure that there is human input into how the business language is being interpreted by the data."

A formula that seems to result in the highest level of accuracy from Sage -- up to 98%, according to Arora -- is to first roll the tool out to power users within an organization for a few weeks. Those power users are able to train Sage to some degree so the tool begins to understand the business.

Then, after those few weeks, Sage's use can be expanded to more users.
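
One plausible way to implement that human-plus-AI loop -- a minimal sketch under assumed details, not ThoughtSpot's actual design -- is to store question-to-SQL pairs that power users approve and replay them as few-shot examples in each prompt, so the model picks up organization-specific business language such as internal metric names:

```python
# Hypothetical human-in-the-loop accuracy flow: power users approve
# question->SQL pairs, which are replayed as few-shot examples so the model
# learns the organization's business language. Not any vendor's actual design.
from openai import OpenAI

client = OpenAI()

# Pairs verified by power users during the initial rollout weeks.
approved_examples: list[tuple[str, str]] = [
    ("Show ARR by segment",
     "SELECT segment, SUM(arr) FROM accounts GROUP BY segment;"),
]

def approve(question: str, sql: str) -> None:
    """Called when a power user confirms a generated query is correct."""
    approved_examples.append((question, sql))

def answer(question: str) -> str:
    messages = [{"role": "system",
                 "content": "Translate questions into SQL. Return only SQL."}]
    # Human-approved pairs teach the model org-specific terms such as 'ARR'.
    for q, sql in approved_examples:
        messages.append({"role": "user", "content": q})
        messages.append({"role": "assistant", "content": sql})
    messages.append({"role": "user", "content": question})
    response = client.chat.completions.create(
        model="gpt-4", messages=messages, temperature=0)
    return response.choices[0].message.content.strip()
```

Each approval makes the next translation more likely to use the organization's own vocabulary.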


"There is no easy button for GenAI," Arora said. "But there is a flow that if you follow, you can get amazing results. Once the system is properly used by the power users and data analysts, it becomes ready for prime time."

But there's more to the lag time between introducing generative AI analytics capabilities and their general availability than just concerns and risks related to accuracy, security and privacy, according to Ajenstat.

Generative AI represents a complete shift for analytics vendors.

It has been barely a year since OpenAI released ChatGPT, which marked a significant leap in generative AI and LLM capabilities.

Before then, some analytics vendors offered traditional AI and machine learning capabilities but not generative AI. Once ChatGPT was released, it quickly became clear how the technology could make data workers more efficient while enabling more people within organizations to use analytics tools.

But truly integrating the technologies -- ensuring accurate natural language query responses, training chatbots to be able to assist as customers use various tools, and putting governance measures in place to guarantee data privacy and security -- takes time.

"The speed of adoption of ChatGPT took the technology industry by storm," Ajenstat said. "We realized this is a tectonic plate shifting in the landscape where everyone will lean into it. As a result of that, we're at the peak of the hype cycle. That's exciting. It also means there are a lot of ideas."

Getting the ideas from the planning stage to the production stage, however, is not a simple, quick process, he continued.

In particular, analytics vendors need to make sure users can trust their generative AI tools.

"We see the potential, but can users trust the results?" Ajenstat said. "There's also trust in terms of the training data -- sending it to [a third party]. There's trust in terms of bias and ethics. There's a lot below the surface that technology providers and the industry as a whole have to figure out to make sure we're delivering great products that actually solve customers' problems."

Beyond the difficulty related to getting tools ready for enterprise-level consumption, the speed of generative AI innovation is delaying the release of some capabilities, according to Farmer.

He noted that once a feature is made generally available, a vendor is committing to that feature: improving it with updates, supporting users and so forth. But because generative AI is evolving so quickly, vendors are unsure whether some of the capabilities they've revealed in preview are ones they want to commit to long term.

"If you come out with an enterprise product, you're committed to supporting it over the lifetime of an enterprise product," Farmer said. "It's really difficult to say something is stable enough and complete enough and supportable enough to build an enterprise agreement around it."

The future

The natural language query capabilities, AI assistants and summarization tools that many analytics vendors are developing -- and a few have made generally available -- are the first generation of generative AI capabilities.

At a certain point, perhaps during the first half of 2024, most will be ready for widespread use. At that point, vendors will turn their attention to a different set of generative AI features, which will represent a second generation.

Simplification was the primary theme of the first generation; process automation might be the primary theme of the second, according to Ajenstat.

"It will be interesting to see how long we stay in the first generation because we haven't actually gotten the mass adoption there," he said. "But for me, the next phase is going to be about automation. It will be using generative AI as an agent that can automate tasks on your behalf, augmenting the human by removing some of the drudgery that's out there."

Arora likewise predicted that task automation will mark the next generation of generative AI, eventually followed by systems themselves becoming autonomous.

Many of the tasks needed to inform natural language queries, such as data curation and defining metrics, still require manual effort, he noted. But automation of those data preparation tasks is coming.

As for when those automation capabilities will be in production, Arora predicted it could be as early as the second half of 2024 or early 2025.

"Users will be able to connect to a data system and then start asking questions," Arora said. "The system will automatically define metrics, automatically find the right data tables and answer questions with accuracy. There will be complete automation."

Similarly, Menninger cited automation as a likely centerpiece of the next phase of generative AI development. However, he said automation will go beyond merely reducing the drudgery of data preparation.

Menninger expects that the first-generation tools now under development will be available by the end of the first half of next year. Vendors will then turn their attention not simply to automation but to the automation of AI itself.

Generative AI is not the same as traditional AI, Menninger noted. It is not predictive analytics. It still takes the rare expertise of data scientists and PhDs to prepare data and develop AI models that can do predictive analysis.

However, generative AI can eventually be trained to create AI models.

"In the same way GenAI can now generate SQL code, we're going to get to a point where GenAI can generate predictive models that are of a high enough quality that we can rely on them," Menninger said. "And if they're not of a high enough quality, they'll be close enough that we can at least have more productivity around creating AI models."

He added that research from ISG -- now Ventana's parent company -- shows that the No. 1 problem organizations face when developing AI models is a lack of skills.

"To the extent we can use GenAI to overcome that skills gap, that's what the future is about."

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
