Collibra's acquisition of Deasy targets unstructured data

With AI development on the rise, the vendor's latest purchase better enables customers to combine the complete array of relevant data to inform advanced applications.


Collibra on Thursday acquired Deasy Labs to improve its unstructured data governance capabilities and better enable customers to build a trusted foundation for analytics and AI applications.

Financial terms of the deal were not disclosed.

Founded in 2023 by developers from consulting firm McKinsey & Co., Deasy Labs is a New York City-based startup that provides AI-powered metadata management tools that automate governing unstructured data.

Collibra, a metadata management specialist based in Brussels and New York City, already enables users to govern unstructured data based on its metadata with its Collibra Platform. However, classifying and filtering unstructured data files has to date required either manual work or third-party platforms accessed through integrations with Collibra partners.

Once Deasy's capabilities are integrated with the Collibra Platform, Collibra users will be able to automate much of the classifying, filtering and enriching of unstructured data. Subsequently, they can unify their governance of structured and unstructured data to feed AI and analytics initiatives.

As a result, Collibra's acquisition of Deasy adds important capabilities, according to Sanjeev Mohan, founder and principal of analyst firm SanjMo.

"Thus far, we didn't have the right tools to extract intelligence easily from [unstructured data], as it was a manual, error-prone process that couldn't scale," he said. "Now, with AI, we have the tools to analyze this kind of data at scale. Collibra can combine their prowess curating metadata from structured data and apply it to this 'dark data.'"

Donald Farmer, founder and principal of TreeHive Strategy, likewise praised the acquisition.

"I think the acquisition is super interesting because access to unstructured data and the ability to catalogue and govern it has always been an issue," he said. "Collibra customers I have worked with have not necessarily seen this as a weakness, but an area in which they would like more investment. They will be delighted to see this acquisition."

In addition to Collibra, vendors such as Snowflake and Qlik are prioritizing access to unstructured data.

Unlocking unstructured data

Though enterprises have deployed analytics platforms to inform decisions for decades, the data used as the basis for those decisions has largely been structured data such as financial records and point-of-sale transactions.

Unstructured data such as text, images and audio files is exponentially more difficult to analyze. Manually combing through PDFs, emails, phone calls, videos and more to find insights is nearly impossible, and until the rise of generative AI (GenAI) over the past few years, most data management technology wasn't advanced enough to automate parsing unstructured data.

Therefore, although unstructured data makes up the vast majority of all data -- estimates range between 80% and 90% -- most unstructured data went unused.

Now, spurred by enterprises increasing their investments in AI development, there's growing interest in unstructured data.

AI tools, including GenAI and agentic AI applications, are prone to hallucinations. Large amounts of high-quality data reduce the likelihood of hallucinations. Beyond making AI outputs more trustworthy, combining structured and unstructured data provides organizations a more comprehensive view of their operations than structured data alone.

Kevin Petrie, an analyst at BARC U.S., noted that his firm's research shows strong adoption of structured data for AI initiatives, but only one quarter to one third of organizations are also using unstructured data.

With unstructured data an important means of improving AI quality, adding technology that makes it possible to govern and operationalize unstructured and structured data together is significant. As a result, Petrie called Collibra's acquisition of Deasy a smart move.

"A modern AI initiative should include multiple model types, consuming multiple data types," he said. "This makes it critical for data teams to catalog all their data assets and models together."

Competitive advantages from AI result when enterprises can apply AI models to proprietary data sets, Petrie continued. If those data sets contain insights from unstructured objects, enterprises can derive greater context than when the data sets contain structured data alone.

"This acquisition will enable Collibra users to organize and prepare unstructured data for AI model training and inference," he said.

Deasy Labs' technology connects directly to unstructured data sources, automatically detects taxonomies to classify data within those files and enriches the files with structured metadata so they can integrate with structured data to feed AI and analytics tools.
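To make the classify-and-enrich idea concrete, the following is a minimal sketch in Python, not Deasy Labs' or Collibra's actual implementation. It swaps in simple keyword matching for the AI-driven taxonomy detection described above, and every name in it (`TAXONOMY`, `EnrichedDocument`, `classify`, `enrich`, `filter_by_tag`) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical taxonomy mapping a label to trigger keywords.
# Assumption: a real product would detect taxonomies with AI models,
# not a hand-written keyword table like this one.
TAXONOMY = {
    "finance": {"invoice", "payment", "balance"},
    "support": {"complaint", "refund", "ticket"},
}


@dataclass
class EnrichedDocument:
    """An unstructured file paired with structured metadata tags."""
    source: str
    text: str
    tags: list[str] = field(default_factory=list)


def classify(text: str) -> list[str]:
    """Return the sorted taxonomy labels whose keywords appear in the text."""
    words = {w.strip(".,;:!?").lower() for w in text.split()}
    return sorted(label for label, keywords in TAXONOMY.items() if words & keywords)


def enrich(source: str, text: str) -> EnrichedDocument:
    """Attach structured tags to raw text so it can be searched and filtered."""
    return EnrichedDocument(source=source, text=text, tags=classify(text))


def filter_by_tag(docs: list[EnrichedDocument], tag: str) -> list[EnrichedDocument]:
    """Keep only the documents that carry the given tag."""
    return [doc for doc in docs if tag in doc.tags]


docs = [
    enrich("call_transcript.txt", "Customer requested a refund for the invoice."),
    enrich("notes.txt", "Meeting notes about the team offsite."),
]
support_docs = filter_by_tag(docs, "support")
```

Once each file carries tags like these, it can be filtered and joined with structured records in the same way as any other governed asset, which is the outcome the paragraph above describes.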

Specifically, Collibra's existing capabilities combined with those from Deasy will enable the following:

  • Automated semantic modeling that classifies and filters unstructured data to give it structure so it can be searched.
  • AI-powered discovery of relevant data based on semantic tagging.
  • Aiding the sustained performance of AI tools with larger data volumes.

Though all are valuable, automatic semantic modeling is perhaps the highlight feature Collibra's acquisition of Deasy will add, according to Mohan.

Farmer, meanwhile, noted that despite automating much of the metadata management of unstructured data, Deasy's tools still require human oversight to verify and approve AI-generated classifications and filters.

With human involvement and AI integrated into metadata management, Deasy's capabilities complement Collibra's existing ones.

"I think this is a good acquisition," Farmer said. "It shows a strategic commitment from Collibra to expand their user base."

Customer feedback gave Collibra the impetus for the acquisition, according to Felix Van de Maele, the vendor's CEO.

Meanwhile, Collibra's decision to acquire Deasy Labs' unstructured data governance capabilities rather than develop similar capabilities on its own was driven by the time needed to build from scratch compared with buying ready-made tools, Van de Maele continued.

"Acquiring Deasy Labs allowed us to leap ahead with proven GenAI-native technology and talent," he said.

A graphic displays the various types of unstructured data.

Next steps for Collibra

Over the second half of 2025, adding more automation capabilities to better enable customers to govern data and AI will be a focal point for Collibra, according to Van de Maele.

Mohan, meanwhile, suggested that Collibra use the capabilities it inherited through its acquisition of Deasy to develop agentic AI-powered unstructured data management capabilities tailored to the needs of specific industries.

"Deasy Labs can help in agentic AI use cases for vertical industries such as from banking documents to call transcripts," he said.

Eric Avidon is a senior news writer for Informa TechTarget and a journalist with more than 25 years of experience. He covers analytics and data management.
