Informatica unveils plan to infuse Claire with generative AI
The longtime independent vendor plans to infuse its existing AI engine with generative AI to simplify data management by enabling customers to use natural language.
Informatica unveiled Claire GPT, an integration between the vendor's augmented intelligence engine and generative AI capabilities that will enable Informatica customers to manage their data using natural language.
Informatica is a data management vendor whose platform is the Intelligent Data Management Cloud (IDMC). Claire, meanwhile, was developed in 2017. It is the AI and machine learning engine that is built into the individual data management tools that make up the IDMC to unify metadata and automate metadata management.
Now, through its partnerships with OpenAI and Microsoft -- which began investing in OpenAI in 2019 -- Informatica is augmenting Claire with generative AI in an attempt to make enterprise data management available to a broader array of users with natural language processing enabled by OpenAI's large language models.
Claire GPT, unveiled on May 9, is scheduled for general availability during the second half of 2023. It was introduced along with a host of other new features during Informatica World, the vendor's user conference in Las Vegas this week.
Informatica and generative AI
Historically, enterprise data management has required data engineers to write code to build data pipelines and integrate their organization's data. Informatica recently launched a no-code data integration tool designed to enable non-technical business users to develop their own data pipelines.
But even the myriad no-code tools developed by data management and analytics vendors in recent years to enable some level of self-service use required some level of data literacy. Commands and queries, at the least, needed to be phrased in a highly specific way to be understood by natural language tools.
Large language models (LLMs), however, boast much bigger language libraries than traditional NLP tools and can enable more freeform language.
As a result, numerous analytics vendors, including ThoughtSpot and Tableau, have developed integrations with OpenAI since its November 2022 release of ChatGPT. Among data management vendors, Databricks developed its own LLM.
Now, Informatica is joining the fray in a move aimed at advancing its existing AI and ML capabilities, including intelligent automation.
Claire has been learning from big data use since its inception as it fuels data management at scale, noted Stewart Bond, an analyst at IDC. But adding generative AI has the potential to boost its intelligence by enabling it to learn from more than just trained data experts, which is significant.
"In the middle of Claire's name is AI," he noted. "Claire has been learning from metadata and augmenting and assisting the work of data engineers, data stewards and data analysts for some time now. Adding GPT is taking that one step further, expanding its generative capabilities."
But whether the generative AI-infused tools being developed by data management and analytics vendors actually deliver on their promise remains to be seen, Bond continued.
As with Informatica's version, none are yet generally available. Each vendor has said that it will be late 2023 before their generative AI capabilities are released to the public.
"Informatica, along with other vendors in the data intelligence and integration space, have been waving the 'intelligent automation' flag for some time and calling all of it AI," Bond said. "One of the benefits of the ChatGPT phenomenon is that organizations will now start evaluating the 'AI' capabilities of these products. That is, they will be looking for real AI and not just rules-based automation."
While generative AI has the potential to enable more people to work with data, ChatGPT and other LLMs have raised security concerns.
ChatGPT suffered a recent data breach. As a result, Italy and China are two countries that have banned its use (Italy has since lifted its ban). Other countries are considering similar measures until more is known about the technology. In addition, the inaccuracy of some of the responses generated by LLMs is a worry for many users and potential users.
Informatica aims to address those concerns by limiting Claire GPT to the constrained task of data management, according to Jitesh Ghai, Informatica's chief product officer.
"In many ways it feels like the early days of the internet," Ghai said. "There's a lot of potential, but there are a lot of governance-related concerns and a lot of intellectual property-related concerns. We're simplifying that by taking large language models and focusing on the singular, hard problem of data management for enterprise data."
One of the key benefits of Claire GPT is that it simplifies data management even for the trained data engineers generally tasked with developing and overseeing their organization's data pipelines, Ghai continued.
The tool provides a prompt-based interface for data management in which a user can ask to be connected to an application such as Salesforce. Users can then tell the tool to pull certain types of data and aggregate it in a particular way -- for example, monthly customer data -- apply data quality and governance measures, and then load it into a data warehouse such as Snowflake.
That can be done in a matter of sentences.
In the past, however, that would have taken one data engineer with a data ingestion tool to connect to Salesforce and load the data into Snowflake. It would have then taken another data engineer with an extract, load and transform tool to curate the data in Snowflake. Finally, it would have taken a third data expert to ensure data quality.
"Now, one individual, with three sentences, is able to convey what they would like done with regards to data management, and then Claire does it," Ghai said.
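To make the contrast concrete, the hand-built stages Ghai describes can be sketched in a few lines of Python. This is an illustration only, not Informatica's implementation or any vendor's API: the function names, the sample records and the in-memory dictionary standing in for a warehouse are all hypothetical, and no real Salesforce or Snowflake connection is made.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample records standing in for data pulled from a CRM.
SAMPLE_RECORDS = [
    {"customer_id": "C1", "signup_date": date(2023, 1, 15), "revenue": 120.0},
    {"customer_id": "C2", "signup_date": date(2023, 1, 28), "revenue": 80.0},
    {"customer_id": "C3", "signup_date": date(2023, 2, 3), "revenue": 200.0},
    # Missing customer ID: this record should fail the quality check.
    {"customer_id": None, "signup_date": date(2023, 2, 9), "revenue": 50.0},
]


def ingest():
    """Stand-in for connecting to a source application and pulling records."""
    return list(SAMPLE_RECORDS)


def apply_quality_rules(records):
    """Keep only records with a customer ID and non-negative revenue."""
    return [r for r in records if r["customer_id"] and r["revenue"] >= 0]


def aggregate_monthly(records):
    """Count customers and sum revenue per calendar month (YYYY-MM)."""
    totals = defaultdict(lambda: {"customers": 0, "revenue": 0.0})
    for r in records:
        month = r["signup_date"].strftime("%Y-%m")
        totals[month]["customers"] += 1
        totals[month]["revenue"] += r["revenue"]
    return dict(totals)


def load(table, warehouse):
    """Stand-in for a warehouse load: store the table in a plain dict."""
    warehouse["monthly_customers"] = table
    return warehouse


def run_pipeline():
    """Chain the stages: ingest, quality check, aggregate, load."""
    cleaned = apply_quality_rules(ingest())
    return load(aggregate_monthly(cleaned), {})


if __name__ == "__main__":
    print(run_pipeline())
```

Each function here corresponds to work that, in Ghai's telling, once fell to a separate specialist with a separate tool. The pitch for Claire GPT is that a few natural language sentences replace the hand-written equivalent of all of these stages.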
Beyond Claire GPT, other new features unveiled by Informatica include the following:
- IDMC for ESG Sustainability, an industry-specific version of IDMC that includes prebuilt capabilities and data sets related to environmental, social and governance. It is aimed at helping organizations meet ESG regulations and support ESG initiatives.
- Cloud Data Integration for PowerCenter, a service designed to help users of PowerCenter -- Informatica's on-premises data integration platform -- more quickly and easily migrate to IDMC.
- A new collaboration with Microsoft to provide Azure customers with Informatica's capabilities in a more native way.
- The native launches of Informatica's Intelligent Master Data Management SaaS on Google Cloud and IDMC on Google Cloud in Europe.
- An enhanced relationship with AWS that includes go-to-market efforts and accelerated cloud transformation capabilities for joint customers.
- An expanded partnership with consulting and technology firm ZS that will embed IDMC into the Zaidyn platform from ZS built for life sciences.
Bond noted that the effort to make cloud migration faster and easier addresses a lingering need.
"These new offerings are accelerating migrations from PowerCenter into the Informatica cloud, which was the biggest challenge for existing PowerCenter customers," he said.
In addition, Bond highlighted the importance of assisting enterprises in their ESG-related efforts by enabling them to capture and cleanse data that can be used for ESG reporting as well as use their data to estimate their own environmental impact and manage emissions.
However, the power required to train large AI models can have an environmental impact. While Informatica is attempting to help customers be more environmentally conscious, it is also potentially enabling its users to create more environmental damage, he noted.
"It looks good on Informatica to help customers tackle [ESG] problems," Bond said. "What is a little ironic is that Informatica is increasing its use of AI when we also know that the power required to train large AI models can have a significant environmental impact. At a minimum, we could hypothesize that Informatica is introducing new ESG capabilities to offset the increased use of power for AI."
As Informatica plots its roadmap, it remains focused on the task of simplifying data management even as the amount of data organizations ingest is increasing and the data they collect is becoming more complex, according to Ghai.
For the vendor, that means continuing to invest in AI.
"There is enormous opportunity for our customers to drive their AI-powered digital transformation," Ghai said. "It's no longer just digital transformation. Now it's about AI-powered digital transformation. That means enabling our customers to leverage the cloud, trusted data and trusted AI."
Bond, meanwhile, noted that Informatica is one of the last remaining independent data intelligence and integration vendors. Other independent vendors have either branched out and now offer additional capabilities or have been acquired.
For example, Talend is being acquired by Qlik, and Alteryx acquired Trifacta to add data wrangling capabilities.
Informatica's balance sheet appears healthy, as evidenced by recent first-quarter 2023 earnings that included annual recurring revenue growth of 20% compared with the first three months of 2022. But going forward, the vendor may need to broaden its portfolio.
"There are [vendors] that focus on just data integration and just data engineering," Bond said. "But Informatica's closest competitors that have both sets of capabilities also have broader software portfolios. There has been consolidation in the market. It will be interesting to see how Informatica responds."
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.