Generative AI capabilities highlight latest Denodo update

The data virtualization specialist's platform update includes natural language processing capabilities that reduce repetitive tasks as well as some complexities of data management.

New generative AI features designed to improve the productivity of data workers highlight the latest Denodo platform update.

Included in the latest version of Denodo Platform, which was made generally available on Oct. 31, are natural language processing (NLP) capabilities powered by integrations with generative AI platforms ChatGPT and Microsoft Azure OpenAI, new data catalog tools that enable collaboration, and enhanced parallel processing capabilities that improve scale while keeping costs under control.

Overall, the update includes a strong combination of capabilities with the new generative AI capabilities at the forefront, according to Sanjeev Mohan, founder and principal of SanjMo.

"It includes a very important first step with generative AI," he said. "But it is only the beginning of the journey."

Based in Palo Alto, Calif., Denodo is a metadata management vendor with a platform that connects data across a data virtualization architecture.

Data virtualization is an approach to data integration and management that enables organizations to combine disparate types of data, such as structured and unstructured, without forcing users to first manipulate that data to make it uniform.

Among its primary features is that it enables organizations to develop a single representation of their data without having to copy or move data.
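
As a rough illustration of that idea, the sketch below joins two differently shaped sources at query time and persists no combined copy. It is a toy example, not Denodo's implementation; the source data and the names `read_crm`, `read_billing` and `customer_spend_view` are hypothetical.

```python
import csv
import io

# Two hypothetical sources in different shapes. Neither is copied into a central store.
CRM_ROWS = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]
BILLING_CSV = "customer_id,amount\n1,120.50\n2,75.00\n1,30.00\n"


def read_crm():
    """Yield CRM records straight from the source on each call."""
    yield from CRM_ROWS


def read_billing():
    """Parse billing rows from the CSV source on each call."""
    for row in csv.DictReader(io.StringIO(BILLING_CSV)):
        yield {"customer_id": int(row["customer_id"]), "amount": float(row["amount"])}


def customer_spend_view():
    """A virtual view: reads both sources at query time and persists nothing."""
    totals = {}
    for bill in read_billing():
        totals[bill["customer_id"]] = totals.get(bill["customer_id"], 0.0) + bill["amount"]
    for customer in read_crm():
        yield {
            "name": customer["name"],
            "total_spend": totals.get(customer["customer_id"], 0.0),
        }


if __name__ == "__main__":
    for row in customer_spend_view():
        print(row)
```

Each call to `customer_spend_view()` rereads the sources, so a change in either one shows up in the next query without a reload step.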

In addition to Denodo, AtScale and Datameer are data virtualization specialists. Other data management vendors that offer data virtualization capabilities as part of broader platforms include Informatica, IBM, Oracle, SAP and Tibco.

Denodo's platform update comes about six weeks after the vendor raised $336 million in equity funding from TPG Growth.

New capabilities

Two of the main benefits of generative AI are its potential to increase data workers' productivity and its potential to make tasks that were previously too complicated for business analysts accessible to non-experts by eliminating complexities such as coding.

Generative AI enables freeform NLP given the vast vocabularies of large language models (LLMs).

Numerous data management and analytics vendors have developed NLP capabilities in recent years. But their tools had limited vocabularies and required users to have the requisite training to phrase queries and commands in precise, business-specific ways. In essence, they were a code of their own.

Now, however, by integrating with LLMs such as ChatGPT, Google Bard and Azure OpenAI, data management and analytics vendors are enabling customers to query and model data in conversational language rather than SQL or other coding languages. As a result, more than just trained data experts within organizations can work with data.

Meanwhile, by reducing the need to write code, LLMs are making data experts more efficient, freeing them from the time-consuming task of generating the massive amounts of code it takes to model data and develop data products.

Denodo's new integrations with ChatGPT and Azure OpenAI enable natural-language access to the metadata housed within the vendor's platform.

Because generative AI enables more people than just trained data experts to work with data while making data experts more efficient, Denodo's integrations with ChatGPT and Azure OpenAI are significant for the vendor's customers, according to Mohan.

"For years and years, we've been talking about democratizing data," Mohan said. "Generative AI actually democratizes data because any user with the right permissions can now ask questions of their data in their natural language, and the product can now translate it to [code]. You no longer have to know SQL. That's why I feel generative AI is going to be a game changer."

Ravi Shankar, Denodo's senior vice president and chief marketing officer, likewise cited making data exploration and analysis possible for more than just trained data experts as the main benefit of the vendor's initial generative AI capabilities.

"This enables democratization of data to business users who are not technical," he said. "They don't have to learn or use SQL skills to get their data."

One of the biggest concerns many organizations have about integrating with public LLMs is that their data will get exposed in a data breach, such as the one ChatGPT suffered last spring.

However, because Denodo deals only with metadata, customers' data is safe when using the vendor's new generative AI capabilities, according to Shankar.
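
One way to picture a metadata-only approach is a prompt assembled from view and column names alone, with no row values included. The sketch below is an assumption about how such a prompt could be built, not a description of Denodo's integration; `VIEW_METADATA` and `build_prompt` are hypothetical names, and no LLM is actually called.

```python
# Hypothetical catalog metadata: view names, column names and types only, no row values.
VIEW_METADATA = {
    "customer_spend": [
        ("name", "text"),
        ("region", "text"),
        ("total_spend", "decimal"),
    ],
}


def build_prompt(question, metadata):
    """Assemble an LLM prompt from the question and schema metadata alone."""
    lines = ["Translate the question into SQL using only these views:"]
    for view, columns in metadata.items():
        column_list = ", ".join(f"{col} {col_type}" for col, col_type in columns)
        lines.append(f"- {view}({column_list})")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("Which region spent the most?", VIEW_METADATA))
```

In this sketch, the generated SQL would then run inside the organization's own platform, so the external model sees only the schema description and the question.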

"We do not expose the underlying data," he said. "The users still have complete intellectual control of the data itself. In addition, Denodo comes with security capabilities, so users can individually set by user what data they can access. This is a big topic, and we eliminate [security risks] by dealing with the metadata and not the actual data."

Beyond the new NLP capabilities enabled by the integrations with ChatGPT and Azure OpenAI, the latest Denodo Platform update includes the following:

  • New data catalog features that enable self-service users to author, deliver and share datasets both within and outside their organization using a drag-and-drop interface.
  • Embedded massive parallel processing (MPP) capabilities based on Presto, an open source SQL query engine, that improve performance when processing large data volumes.
  • A new financial operations dashboard. Users can view the cost of running workloads on Denodo's platform in real time so they can better monitor and control cloud computing costs, which can often exceed expectations when not closely watched.
  • Improved data governance features, including synchronization with Collibra's Data Intelligence Cloud, that simplify managing fine-grained access control policies to ensure data privacy and regulatory compliance.
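
The fine-grained access control mentioned in the last item can be pictured as filtering rows against a user's attributes before any results are returned. The following is a minimal sketch under that assumption; the `regions` attribute, the sample rows and `visible_rows` are hypothetical and do not reflect Denodo's or Collibra's actual policy model.

```python
# Hypothetical rows in a governed view.
ROWS = [
    {"region": "EMEA", "revenue": 100},
    {"region": "APAC", "revenue": 200},
    {"region": "EMEA", "revenue": 50},
]


def visible_rows(user_attributes, rows):
    """Return only the rows whose region appears in the user's entitlements."""
    allowed = set(user_attributes.get("regions", []))
    return [row for row in rows if row["region"] in allowed]


if __name__ == "__main__":
    analyst = {"regions": ["EMEA"]}  # hypothetical user attributes
    print(visible_rows(analyst, ROWS))
```

A user with no `regions` attribute sees nothing, which makes denial the default in this sketch.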

While each has value, Mohan noted that the data governance and cost management capabilities are particularly important.

"Denodo is taking security to the next level," he said. "Now Denodo can look at my attributes and show me only the data I'm entitled to see. Also, financial operations stands out. We are still facing some serious economic headwinds, so anything and everything a product vendor can do to optimize resource usage and recommend alternatives is really important."

The impetus for including features such as natural language query, MPP and a way to monitor costs as they are incurred came from a combination of customer feedback, monitoring of the data management market and internal product development plans, according to Shankar.

Future plans

With the latest Denodo Platform update now available, the vendor's future product development plans include developing a SaaS version of its platform, according to Shankar.

"We're currently piloting that with some of our customers right now and hope to release it early next year," he said.

Mohan, meanwhile, noted that while Denodo's integrations with ChatGPT and Azure OpenAI represent a solid initial incorporation of generative AI, he'd like to see the vendor do more.

In particular, Mohan said Denodo needs to add vector search and vector embedding capabilities.

Generative AI models need to be trained on massive amounts of data to ensure their accuracy. Unlike other AI and machine learning models, generative AI models will deliver outputs to queries regardless of whether they have the proper information to inform those outputs. When they don't have the proper information, they deliver AI hallucinations.

Therefore, the more data organizations can use to train generative AI models, the less likely those models are to deliver hallucinations.

Vectors enable organizations to train models with all types of data. They give value to unstructured data, enabling unstructured data, such as text and audio files, to be combined with traditional structured data, such as financial records, to provide a broader set of data points to train generative AI models.

In addition, vectors enable similarity searches so organizations can discover as much data as possible that can be used to inform a particular generative AI model.
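
A similarity search of this kind typically ranks stored vectors by how closely they point in the same direction as a query vector. The sketch below uses cosine similarity over hand-made three-dimensional vectors; the file names and numbers are hypothetical, and real embeddings would come from a model and have far more dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings for a mix of unstructured and structured sources.
DOCUMENTS = {
    "q3_earnings_call.txt": [0.9, 0.1, 0.0],
    "support_ticket.txt": [0.1, 0.8, 0.3],
    "revenue_ledger.csv": [0.8, 0.2, 0.1],
}


def most_similar(query_vector, documents, top_k=2):
    """Return the names of the top_k documents closest to the query vector."""
    ranked = sorted(
        documents,
        key=lambda name: cosine_similarity(query_vector, documents[name]),
        reverse=True,
    )
    return ranked[:top_k]


if __name__ == "__main__":
    print(most_similar([1.0, 0.0, 0.0], DOCUMENTS))
```

Here a text transcript and a structured ledger rank together because their vectors are close, which is the sense in which vectors let unstructured and structured data be discovered side by side.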

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
