Snowflake on Monday reached an agreement to buy Samooha to help the data cloud vendor's customers more easily develop data clean rooms.
Data clean rooms enable users to combine proprietary data with third-party data in a secure and governed manner that keeps sensitive data private, something gaining importance as organizations recognize the benefits of collaborating with partners to better inform decisions.
Snowflake already has tools that enable customers to build data clean rooms, but the acquisition will improve those capabilities. As a result, the acquisition is not just a smart move for Snowflake but a needed one, according to Stephen Catanzano, an analyst with TechTarget's Enterprise Strategy Group.
In addition to Snowflake, rivals including Google and AWS offer data clean room capabilities, as do specialists such as Helios and Duality.
"Google and others have paved the way for this secure, governed exchange of internal data to partners and it will now become more mainstream for other organizations to do the same," Catanzano said. "[It was] something overdue."
Samooha, founded in 2022 and based in Los Altos, Calif., develops cross-cloud collaboration tools, including data clean rooms. The vendor raised $12.5 million in its lone funding round in February 2023, with Snowflake Ventures one of the lead investors.
The acquisition of Samooha is Snowflake's ninth of 2023. The others include its May acquisition of Neeva, a search engine specialist whose tools are driven by generative AI and large language model (LLM) technology.
The cloud data vendor did not disclose financial terms of the deal, which remains subject to customary closing conditions.
As organizations aim to improve decision-making -- which includes developing generative AI applications trained with their own data -- technologies that enable collaboration while ensuring accuracy and security take on greater importance.
For example, as organizations feed proprietary data into generative AI platforms so that models trained on public data can understand their business, vector search, which enables data discovery, is becoming a critical capability.
Similarly, collaboration within organizations and among partner organizations is gaining momentum, and with it the use of data clean rooms.
"Data clean rooms have become a big deal since there needs to be a better way to share data between partnering companies," Catanzano said.
One common application of data clean rooms is brand marketing, Catanzano continued. Partners can share data, with personally identifiable information automatically obscured by the clean room to protect privacy.
Within the data clean room, collaborators can identify attributes common to top customers and then search for similar prospects within partners' data.
"It's about sharing my confidential analytics and data with partners," he said. "It's highly confidential, very targeted and more. If you buy data from Google, you now can only get it in a clean room. Compliance and governance were a big driver."
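The marketing scenario Catanzano describes can be illustrated with a minimal Python sketch. It is not how Snowflake or Samooha implement clean rooms; the salted hash, the `MIN_GROUP_SIZE` threshold, the field names and the sample records are all assumptions made for illustration. Each party replaces email addresses with a pseudonymous key, the two data sets are matched on that key, and only aggregate counts above the threshold are returned.

```python
import hashlib
from collections import Counter

# Hypothetical values: in this sketch both parties agree on them in advance.
SHARED_SALT = b"demo-salt"
MIN_GROUP_SIZE = 2  # suppress groups too small to report safely


def pseudonymize(email: str) -> str:
    """Turn an email address into a salted SHA-256 key."""
    normalized = email.strip().lower().encode("utf-8")
    return hashlib.sha256(SHARED_SALT + normalized).hexdigest()


def prepare(records, keep_fields):
    """Drop the raw email and keep only the approved attributes."""
    return {
        pseudonymize(r["email"]): {f: r[f] for f in keep_fields}
        for r in records
    }


def overlap_by_attribute(brand, partner, attribute):
    """Count matched customers per partner attribute, aggregates only."""
    matched = brand.keys() & partner.keys()
    counts = Counter(partner[key][attribute] for key in matched)
    return {value: n for value, n in counts.items() if n >= MIN_GROUP_SIZE}


# Illustrative sample data, not real customers.
brand_customers = [
    {"email": "ana@example.com", "tier": "top"},
    {"email": "Bo@example.com", "tier": "top"},
    {"email": "cy@example.com", "tier": "top"},
]
partner_audience = [
    {"email": "ana@example.com", "region": "west"},
    {"email": "bo@example.com ", "region": "west"},
    {"email": "cy@example.com", "region": "east"},
    {"email": "di@example.com", "region": "west"},
]

brand = prepare(brand_customers, ["tier"])
partner = prepare(partner_audience, ["region"])
result = overlap_by_attribute(brand, partner, "region")
print(result)
```

Neither side sees the other's raw email addresses, and the single matched customer in the "east" region is withheld because the group falls below the threshold. Production clean rooms rely on stronger protections than a shared salted hash, which is used here only to keep the example short.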
Jay Piscioneri, an analyst at Eckerson Group, likewise said Snowflake's acquisition of Samooha was a wise move.
In particular, he noted Samooha's cross-cloud data sharing capabilities and easy-to-use experience. Many organizations store data on more than one cloud and there's no guarantee that a partner stores its data on the same clouds.
"Setting up clean rooms can be complicated, especially when each party's data is stored on different cloud platforms," Piscioneri said. "Samooha's … simplified user experience makes it easier for non-technical users to set up and run a data clean room."
Data clean rooms are not a common part of the generative AI process, but they could become one as more organizations develop their own private language models, according to Piscioneri.
They don't make sense for public LLMs such as ChatGPT and Google Bard that are trained on public data. However, they could make sense for models trained using private data, adding to the significance of Snowflake's acquisition of Samooha.
"It's too early to say [whether data clean rooms will benefit GenAI]," Piscioneri said. "But as use grows of small language models that constrain GenAI to an enterprise's data without releasing it into the wild, GenAI will likely become part of the user experience of the clean room."
Like Databricks, which is perhaps Snowflake's closest rival, Snowflake has prioritized generative AI in the year since OpenAI released ChatGPT.
Beyond its acquisition of Neeva in May, Snowflake also added new containerization capabilities that enable users to access generative AI software and unveiled the private preview of its own LLM.
If collaboration between partners becomes common as organizations develop their own language models, data clean rooms could become part of the generative AI development process just as vector search has evolved to become critical to training LLMs.
"[Data clean rooms] could be a part of this as more LLMs are created and those data insights shared," Catanzano said.
Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.