Informatica is continuing to move its data services into the cloud, with the newest offering set to be the Cloud Data Governance and Catalog service.
Informatica previewed the new cloud capabilities on July 13, with general availability expected later this quarter.
The data governance and data catalog services will be part of Informatica's Intelligent Data Management Cloud (IDMC), which was introduced in April. The IDMC provides services an organization might need to use data effectively for business operations, including master data management, analytics, business intelligence and artificial intelligence.
The new Cloud Data Governance and Catalog service will help bolster Informatica's position in the data management and integration sector, said Daniel Elman, an analyst at Nucleus Research.
Elman noted that demand for data governance controls is growing as analytics and data integration take high priority in IT offices around the world. The demand has been accelerated by the pandemic, which caused distributed teams to require more agile digital tools to enable collaboration at scale on data analysis, management and integration projects while still providing centralized control and accountability.
"Demand for these capabilities is only set to increase, as the majority of companies are very early on in deploying modern enterprise data management frameworks and strategies," Elman said.
How Cloud Data Governance and Catalog affects machine learning
One of the applications for the new service is to enable machine learning workloads. Machine learning works by training on data, and understanding and controlling that data is often a challenge.
There are often many handoffs between the starting point of finding the data an organization uses and the creation of a machine learning output, which leads to compliance, governance, security and trust problems, according to Hyoun Park, CEO and chief analyst at Amalgam Insights.
"Informatica is providing more stringent governance to reduce the black box mysteries of machine learning and create more transparency not only to where the data comes from, but how it is being used, and how data sources change over time as their use for analytics and machine learning changes," Park said.
The evolution of data catalogs in the cloud
Data catalogs are not a new market for Informatica, as the vendor has an on-premises Enterprise Data Catalog in its product portfolio already.
David Corrigan, Informatica's general manager for data governance, quality and privacy, explained that the new data catalog service is an evolution of the on-premises platform for the cloud. The IDMC provides the ability to integrate more tightly with other data services in the platform, providing a common data profiling, administration and metadata layer.
Various cloud providers typically offer some form of native data catalog service.
For example, Amazon has the AWS Glue data catalog service, Microsoft has Azure Data Catalog and Google has its own Data Catalog service. Corrigan said Informatica's service can enable what he described as a catalog of catalogs, which will help organizations with multi-cloud deployments get a more complete understanding of all the data they have.
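The catalog-of-catalogs idea can be illustrated with a minimal Python sketch. It is not Informatica's implementation and calls no vendor API; the provider labels, dataset names, locations and the `merge_catalogs` helper are all hypothetical, chosen only to show how entries from several catalogs might be combined into one index.

```python
def merge_catalogs(catalogs):
    """Combine per-provider catalog entries into one index keyed by dataset name.

    `catalogs` maps a provider label to a list of entries, each a dict with
    "name" and "location" keys. All of this structure is hypothetical.
    """
    unified = {}
    for provider, entries in catalogs.items():
        for entry in entries:
            unified.setdefault(entry["name"], []).append(
                {"provider": provider, "location": entry["location"]}
            )
    return unified


# Hypothetical entries standing in for what each cloud's native catalog might list.
catalogs = {
    "aws": [{"name": "customers", "location": "s3://example-bucket/customers/"}],
    "azure": [{"name": "customers", "location": "example-account/customers"}],
    "gcp": [{"name": "orders", "location": "example-project.sales.orders"}],
}

unified = merge_catalogs(catalogs)
providers_with_customers = sorted(e["provider"] for e in unified["customers"])
```

In this sketch, looking up one dataset name in `unified` shows every cloud where a copy is registered, which is the kind of cross-cloud view Corrigan described.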
From Axon to Cloud Data Governance
In the on-premises world, Informatica's data governance technology is called Axon Data Governance. With the new cloud service, Corrigan emphasized that Informatica is not just bringing the same capabilities from Axon to the cloud but is also more tightly integrating governance with the data catalog capability.
A core element of the new cloud service is data lineage capabilities to help users understand where data came from. Informatica gained some of its data lineage technology in July 2020 with the acquisition of Compact Solutions, which brought metadata management and scanning functions.
"As clients are bringing together data from all over the place into a central system like a data lake, the questions of how did the data get there, what was it joined with and how was it transformed are becoming more important," Corrigan said. "And obviously as data is moving from so many different places, it's becoming harder for organizations to answer."
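The lineage questions Corrigan raised (how the data got there, what it was joined with and how it was transformed) can be pictured as a graph of datasets and transformations. The following Python sketch is only an illustration under that assumption; the `LineageGraph` class, dataset names and transformation labels are hypothetical and do not represent Informatica's or Compact Solutions' technology.

```python
from dataclasses import dataclass, field


@dataclass
class LineageGraph:
    # Maps each target dataset to the (source dataset, transformation) pairs that produced it.
    edges: dict = field(default_factory=dict)

    def record(self, target, sources, transformation):
        """Record that `target` was built from `sources` using `transformation`."""
        for source in sources:
            self.edges.setdefault(target, []).append((source, transformation))

    def upstream(self, dataset):
        """Return every dataset that feeds into `dataset`, directly or indirectly."""
        seen = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            for source, _ in self.edges.get(current, []):
                if source not in seen:
                    seen.add(source)
                    stack.append(source)
        return seen


# Hypothetical data lake pipeline: two source systems joined, then aggregated.
graph = LineageGraph()
graph.record("lake.customer_orders", ["crm.customers", "erp.orders"], "join on customer_id")
graph.record("lake.revenue_by_region", ["lake.customer_orders"], "aggregate by region")

sources_of_revenue = graph.upstream("lake.revenue_by_region")
```

Walking the graph backward from a report-level dataset answers the first question, where the data came from, while the stored transformation labels record what it was joined with and how it was changed along the way.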