Generative AI has been the dominant analytics trend to date in 2023. But the first half of the year marked just the beginning of how generative AI will be used to transform data analysis.
While many vendors have unveiled their plans for integrating generative AI throughout their platforms, few capabilities have moved beyond the preview stage.
There's been plenty of theoretical talk about how generative AI can revolutionize analytics by making it accessible to more than just data experts and easing burdens on those tasked with overseeing their organizations' analytics operations. But few tools have yet been brought to market.
Those that have been introduced remain largely in private preview.
"The generative AI announcements in the analytics/BI market to date are mostly still in development or private preview," said Doug Henschen, an analyst at Constellation Research. "Overall, I'd say we're beyond the idea/theory stage. But we're short of seeing market-proven gains in productivity and insight."
But generative AI hasn't been the only analytics trend so far in 2023.
The cost of cloud computing has become a problem as organizations migrate more of their data and analytics operations to the cloud, so finding ways to keep those costs under control is a rising trend. In addition, emphasizing data quality is becoming more critical as data volume and data complexity increase.
The state of generative AI
Generative AI has become the dominant trend in analytics so far in 2023 because it has the potential to transform the way organizations work with data.
Data is, and long has been, largely the domain of a small group of experts within an organization.
Data is complicated, and analysis beyond just looking at basic figures takes training. Training, however, is time-consuming and can be expensive. As a result, many organizations lack data literacy.
Meanwhile, analytics platforms are complex and most require code to query and manipulate data. Even those aimed at self-service users that include low-code/no-code capabilities and augmented intelligence features, such as natural language processing, require some level of expertise.
As a result, even as technology has advanced to include no-code capabilities and understand some commands and queries in natural language, the number of data users within organizations has remained static at about a quarter of all employees for decades, according to numerous studies, including one in 2022 by Eckerson Group.
If executed properly, generative AI can change that.
Large language models (LLMs) understand freeform natural language rather than just specific business terms and commands. In addition, their automation capabilities can translate written words into code that computers can execute, then translate results back into natural language that any business user can understand.
Therefore, generative AI has the potential to both enable any business user to work with data and make data experts more efficient by reducing the amount of code they need to write to perform their jobs.
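The roundtrip described above -- natural language in, code out, results back as natural language -- can be sketched in a few lines. This is a minimal illustration, not any vendor's implementation; the `fake_llm` function is a hypothetical stand-in so the sketch runs without calling a real model API.

```python
def translate_question_to_sql(question: str, schema: str, llm) -> str:
    """Ask an LLM to turn a business user's question into SQL."""
    prompt = (
        f"Given the table schema:\n{schema}\n"
        f"Write a SQL query answering: {question}\n"
        "Return only the SQL."
    )
    return llm(prompt)

def summarize_result(question: str, rows, llm) -> str:
    """Ask an LLM to explain a query result in plain language."""
    prompt = (
        f"The question was: {question}\n"
        f"The query returned: {rows}\n"
        "Summarize the answer for a non-technical reader."
    )
    return llm(prompt)

# Hypothetical stand-in "LLM" with canned answers; a real system
# would call a hosted model here.
def fake_llm(prompt: str) -> str:
    if "Return only the SQL" in prompt:
        return "SELECT region, SUM(revenue) FROM sales GROUP BY region;"
    return "The West region led revenue last quarter."

sql = translate_question_to_sql(
    "Which region had the most revenue?", "sales(region, revenue)", fake_llm
)
summary = summarize_result(
    "Which region had the most revenue?", [("West", 1.2e6)], fake_llm
)
print(sql)
print(summary)
```

The point of the pattern is that the business user never sees the SQL in the middle step -- they ask a question and read an answer.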
But midway into 2023, generative AI in analytics is still in the potential stage rather than the production stage, according to industry experts.
"The current hype is resulting in a lot of … product announcements. But when you look beyond the marketing and examine the fine print, much of it is not yet available to customers," said James Fisher, chief strategy officer at Qlik. "That said, things are moving quickly, and customers are eager to explore the possibilities."
Following OpenAI's launch of ChatGPT in November 2022, which represented a significant leap in LLM capabilities, many analytics vendors have introduced plans to incorporate generative AI into their platforms.
Sisense was among the first, unveiling an integration with ChatGPT in January 2023 to develop tools fueled by generative AI. Since then -- among others -- Amazon QuickSight, Microsoft Power BI, Qlik, Tableau and ThoughtSpot have unveiled plans for adding generative AI.
In addition, some BI users have found ways to incorporate generative AI while they await generally available tools from vendors.
For example, Fisher noted that Qlik customer Harman International, a global car audio and lighting technology company, built an application with ChatGPT and Qlik that uses natural language to drive insights with Qlik's analytics engine. Meanwhile, some ThoughtSpot customers are using Sage -- the vendor's generative AI-fueled search platform -- in customer preview and seeing benefits, according to Cindi Howson, the vendor's chief data strategy officer.
Henschen added that many enterprises are in their own testing phase, figuring out what they want from generative AI and which vendor's tools might best fit their needs.
"I'm seeing a mix of excitement and caution among the [executives] we talk to," he said. "Everybody is working on a strategy and is kicking the tires of various vendor offerings. Initial tests tend to focus on internal-facing and developer capabilities. There's a great deal of interest in developing customized models that are specific to and secured for exclusive use by the organization."
Trends within the trend
While generative AI, in general, has been the dominant analytics trend to date in 2023, trends are percolating within the overall trend.
Vendors are taking two approaches to incorporating generative AI, according to Donald Farmer, founder and principal of TreeHive Strategy.
Some are integrating with LLMs such as ChatGPT and Google Bard and using the LLMs as natural query interfaces and avenues for creating data narratives.
For example, Sisense -- perhaps the first analytics vendor to unveil plans to add generative AI capabilities to its platform -- is doing so through an integration with ChatGPT. Similarly, ThoughtSpot -- which has pre-existing natural language search capabilities -- is building Sage on top of an integration with OpenAI's GPT-3.
Others, meanwhile, are building their own LLMs that they plan to use as the base for their own generative AI capabilities.
Salesforce, which is developing Einstein GPT to generate new insights, is one example. AWS with Bedrock, Google with Bard and Microsoft with its Copilots are others.
"The market is already starting to segment, and that's important," Farmer said. "It shows that there's something real here."
Those developing their own LLMs tend to be the bigger vendors, among them the tech giants that can use generative AI not only in their analytics and data management tools but also across their cloud computing platforms.
In addition, Farmer noted that others including ThoughtSpot and Domo are doing more than integrating with LLMs but not going so far as fully developing their own.
"You're seeing a segmentation between those that have actually got LLMs that they can build on, and those who are using third-party LLMs," he said. "And somewhere in the middle is ThoughtSpot, which is trying to think through the implications of LLMs without creating an LLM. Domo is doing something similar. These are signs that [generative AI] is a real market."
ThoughtSpot's Howson similarly noted that there are differences in the ways vendors are approaching generative AI. Mostly, it has to do with the vendors' focus.
For example, ThoughtSpot's platform has always been built on the concept of natural language search. The vendor, therefore, is using generative AI to make its search platform more intuitive and simpler to use.
Beyond analytics, vendors are developing industry-specific LLMs, Howson noted. Bloomberg is developing an LLM for financial services; Truveta is building one for healthcare.
"Vendors are taking widely different approaches," Howson said, noting for example that vendors are also focusing on their segments, such as ThoughtSpot for analytics and insights, Atlan and data.world for metadata creation and SnapLogic/GPT for data access.
Other analytics trends
While much of this year's product development has centered around generative AI, there have been other trends as well.
One key rising trend is an emphasis on cost control, according to Henschen.
Cloud service providers tend to charge for consumption. Customers must pay for the compute power they use and the time spent using it. Though seemingly inexpensive at just pennies per minute, cloud computing costs add up quickly when organizations process massive amounts of data and have hundreds of employees -- perhaps even thousands -- working with that data.
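The arithmetic behind that compounding is simple to show. The rate and usage figures below are assumptions chosen for illustration, not any provider's actual prices:

```python
# Back-of-the-envelope view of how per-minute consumption pricing compounds.
rate_cents_per_minute = 8       # assumed price: literally "pennies per minute"
minutes_per_user_per_day = 90   # assumed active compute time per analyst
users = 500                     # analysts querying the warehouse
workdays_per_month = 21

monthly_cost = (rate_cents_per_minute * minutes_per_user_per_day
                * users * workdays_per_month) / 100  # cents to dollars
print(f"${monthly_cost:,.0f} per month")  # → $75,600 per month
```

At pennies per minute, a few hundred active users turn into a six-figure monthly bill -- which is why cost visibility has become a selling point.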
An important development, therefore, is that customers are looking to control the cost of analytics.
Some vendors, such as Tibco, have launched governance capabilities that enable organizations to see when they use the most compute power so they can reduce the amount of power they pay for during down times.
Others have enabled organizations to put tighter access controls in place so users can't run ad hoc queries ad nauseam and run up their organizations' costs. Still others have enabled in-database analysis so users no longer rack up data egress costs.
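The in-database approach means the aggregation runs where the data lives and only the small result set leaves the database. The sketch below uses Python's built-in sqlite3 as a stand-in for a cloud warehouse; the table and figures are invented for illustration:

```python
import sqlite3

# In-memory database standing in for a cloud data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, revenue REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("West", 120.0), ("East", 80.0), ("West", 60.0)])

# In-database analysis: the GROUP BY runs inside the database, so only
# one summary row crosses the wire instead of every raw sales record.
top = conn.execute(
    "SELECT region, SUM(revenue) AS total FROM sales "
    "GROUP BY region ORDER BY total DESC LIMIT 1"
).fetchone()
print(top)  # → ('West', 180.0)
```

Pulling raw rows out for client-side aggregation would move the whole table -- and, in a cloud warehouse, incur egress charges proportional to it.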
"The number one non-generative AI trend for the first half of 2023 was the rise of cloud cost optimization," Henschen said. "Ground zero for that trend has been cloud data warehouse costs. Customers are demanding better cost and workload management insights and analytics from their platform providers."
Some users are moving some of their data back on premises to reduce costs, he added.
Similarly, Howson cited an emphasis on controlling cloud computing costs as an important analytics trend.
"Some of this is changing behaviors in a cloud world -- throttling servers, for example -- but also, it's about vendors giving customers better visibility and controls," she said.
Beyond cost control, continued investment in data quality and other foundational needs for analytics is a key trend, according to Fisher.
Data quality is critical because organizations need accurate data to make accurate decisions. If decisions are made based on bad data, depending on the importance of the decisions, the results can be catastrophic.
History is full of examples of how bad data led to unintended consequences. In the enterprise, decisions based on bad data can lead to unintended expenses, missed opportunities and an overall lack of trust in data as a reliable informant for decisions.
With organizations collecting more data than ever -- and from more diverse and complex sources, such as IoT devices -- to fuel real-time decisions, ensuring data quality is critical.
"Organizations have been addressing the way they access and transform data in the cloud to eliminate … the larger hurdles to real-time decision-making," Fisher said. "That has required continued investments in data pipelines, data integration and data quality solutions. As organizations get these building blocks in place, they are leveraging them to [get] closer to real-time decisions."
The rest of 2023
Just as it has been to date in 2023, generative AI will be the dominant analytics trend for the rest of the year, the experts said. But it will evolve.
The first half of the year was filled with product development plans. During the second half, the first wave of generative AI capabilities will likely hit the market, according to Henschen. That, in turn, will lead to imitation as vendors see the best of what their competitors are doing and become fast followers.
"As new capabilities are released, across many categories of software, we're going to see imitation and creative extensions as each wave of functionality is introduced," Henschen said. "It's going to be a fast-paced environment, with developers and data-savvy analysts being the first user groups to see productivity gains and deeper insight."
Howson likewise said that the rest of the year will be marked by generative AI innovation.
Organizations that are already thinking about how they want to deploy generative AI, and planning for when the enabling tools arrive, will have an advantage over those that wait to see the tools in action before developing a strategy.
"We will see a lot of innovation, and it's an exciting time," Howson said. "Customers who are educating themselves about what is possible and deploying wisely will have a first-mover advantage. Separating what works at scale and in a trusted way will be the difference."
Meanwhile, as generative AI-powered tools are released, generative AI governance -- essentially data governance given that LLMs are built on data -- will become an emphasis, according to Fisher.
"While generative AI has the power to bring value to a wide range of the business, it has to be deployed in a way that protects your data assets," he said.
That might lead many organizations to stop using public LLMs and instead develop customized LLMs trained with just their own data, Fisher continued.
"Large and small language models that run behind the firewall or in a private cloud is where we will see the longer-term use of generative AI, since in those instances the enterprise can ensure data quality, governance and lineage are in place," he said.
The final months of 2023 will also feature the first significant generative AI failures, Farmer predicted. A resulting analytics trend, therefore, will be a bit of a reality check for generative AI.
LLMs have a "hallucination" problem. In other words, not all the responses they deliver are accurate. They also have security problems. Vendors and enterprises will put governance measures in place to protect proprietary data. But when data gets moved, mistakes can happen. Data can get leaked.
"Generative AI is going to have some failures, which may be spectacular," Farmer said. "It could be something that results in major corporate reputational harm because of something done by AI."
Already, some generative AI users are becoming dissatisfied with the limitations of its text generation capabilities and auto-generated images, he continued.
"It's entirely natural that the gloss will go off of generative AI," Farmer said. "That means that by the end of this year and beginning of next year, we'll be focusing on real use cases. Then we'll get a much better sense of the future direction of generative AI."
Eric Avidon is a senior news writer for TechTarget Editorial and is a journalist with more than 25 years of experience. He covers analytics and data management.