Rockset adds vector embedding support to real-time database

The real-time database vendor now enables users to search and combine unstructured data with structured and semi-structured data to provide more in-depth modeling and analysis.

Database vendor Rockset on Tuesday unveiled support for vector embeddings in a move aimed at enabling users to search and operationalize any type of data in real time.

Rockset, based in San Mateo, Calif., previously supported structured and semi-structured data, letting users search and analyze that data at scale in real time using SQL and NoSQL.

With the addition of support for vector embeddings, Rockset now also enables users to search and analyze unstructured data and to combine it with structured and semi-structured data.

In late 2020, the startup, founded in 2016, raised $40 million in venture capital funding, bringing its total financing to more than $60 million. Less than a year later, Rockset launched features that enabled customers to directly query an event stream. Since then, it has worked with vendors including Microsoft, Oracle and Snowflake to build integrations and connectors.

Recently, Rockset, whose founders Venkat Venkataramani and Dhruba Borthakur both came from Facebook, reported that its annual recurring revenue tripled in 2022 and its customer base more than doubled during the year.

New capabilities

Vectors are essentially numerical representations of unstructured data such as text, images and videos that can't be captured in rows and columns in the same manner as structured data. Once an algorithm converts that data to a numerical representation, the resulting vectors are commonly used in semantic searches so that users can discover other data with similar attributes.
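To make the idea of "similar attributes" concrete, the following is a minimal sketch of a similarity search over vectors. It assumes cosine similarity as the measure and uses made-up three-dimensional embeddings and item names. None of these details come from Rockset, and real embedding models typically produce vectors with hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
embeddings = {
    "red running shoe": [0.9, 0.1, 0.0],
    "crimson sneaker": [0.8, 0.2, 0.1],
    "garden hose": [0.0, 0.1, 0.9],
}


def most_similar(query, items):
    """Rank every other item by its similarity to the query item."""
    query_vector = items[query]
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in items.items()
        if name != query
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = most_similar("red running shoe", embeddings)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy data set, the sneaker ranks well above the garden hose because its vector points in nearly the same direction as the query item's vector.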

That, in turn, enables organizations to combine unstructured data with other types of data to get a more complete view of their operations. Enabling that more complete view is part of what makes vectors an important part of modern analytics, according to Stephen Catanzano, an analyst at TechTarget's Enterprise Strategy Group.

"Vectors are significant in data analytics because they provide a powerful and efficient way to represent and analyze large amounts of data," he said. "They allow us to measure similarities and differences between data points and provide a rich set of tools for analyzing and manipulating data. It's almost like going from a single dimension to 3D [by highlighting] how data interconnects."

Specific capabilities now enabled by Rockset's support for vector embeddings include the following:

  • Using SQL to join the results of vector searches with other data to develop more complete real-time AI and machine learning models.
  • Indexing real-time data at high velocity.
  • Generating rapid results from searches that combine vectors, keywords and metadata.
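The first and third capabilities can be illustrated with a small sketch that joins vector similarity scores with ordinary columns in SQL. The example uses Python's built-in sqlite3 module rather than Rockset, so the table names, the `cosine_sim` function and the sample rows are all assumptions made for illustration and do not reflect Rockset's actual SQL syntax.

```python
import json
import math
import sqlite3


def cosine_similarity(a_json, b_json):
    """Cosine similarity between two vectors stored as JSON text."""
    a, b = json.loads(a_json), json.loads(b_json)
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL queries can call it (hypothetical name).
conn.create_function("cosine_sim", 2, cosine_similarity)

conn.executescript("""
    CREATE TABLE product_embeddings (product_id INTEGER PRIMARY KEY, embedding TEXT);
    CREATE TABLE products (product_id INTEGER PRIMARY KEY, title TEXT, price REAL);
""")
conn.executemany("INSERT INTO product_embeddings VALUES (?, ?)", [
    (1, json.dumps([0.9, 0.1, 0.0])),
    (2, json.dumps([0.8, 0.2, 0.1])),
    (3, json.dumps([0.0, 0.1, 0.9])),
    (4, json.dumps([0.5, 0.5, 0.1])),
])
conn.executemany("INSERT INTO products VALUES (?, ?, ?)", [
    (1, "Trail running shoe", 120.0),
    (2, "Road running shoe", 60.0),
    (3, "Garden hose", 25.0),
    (4, "Leather dress shoe", 200.0),
])

query_vector = json.dumps([1.0, 0.0, 0.0])
rows = conn.execute("""
    SELECT p.title, cosine_sim(e.embedding, ?) AS score
    FROM product_embeddings AS e
    JOIN products AS p ON p.product_id = e.product_id
    WHERE p.title LIKE '%shoe%'   -- keyword condition
      AND p.price < 150           -- metadata condition
    ORDER BY score DESC
""", (query_vector,)).fetchall()

for title, score in rows:
    print(f"{title}: {score:.3f}")
```

The query scores every row by brute force, which is acceptable for a four-row sketch. A production system would rely on indexes to avoid scanning every vector.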

Catanzano noted that while Rockset competes against vendors such as Elastic, Rockset is differentiating itself by enabling customers to manage all types of data, and do so at scale.

"It's significant for organizations to be able to combine all types of collected data, and process and model it to create powerful new data insights in real time," he said. "The key for Rockset is to do everything in real time at scale and deliver real-time insights."

Figure: A look at the different types of unstructured data.

Similarly, Venkataramani, who is Rockset's CEO as well as its co-founder, said the key aspect of adding support for vector embeddings is that the vendor now enables users to manage and explore all types of data in a single location.

"One single database can now store your structured data, semi-structured data and your vector embeddings to build rich AI applications," he said. "We are already a database that's good at storing structured data and semi-structured data and combining them to build real-time applications. Now with native vector support, you are now able to build applications that [enable] hybrid searches."

Real-world application

One of the main use cases enabled by combining vector embeddings with structured and semi-structured data in a hybrid application is real-time personalization for e-commerce, Venkataramani continued.

Each product on a website contains both images and text and can be encoded into a vector. Likewise, each customer can be assigned a vector based on the set of products they looked at and purchased.

Those vectors can then be combined with other data, such as which products are in stock, to filter out data that's not relevant in the moment.

From the combination, e-commerce vendors can discover the likelihood of a customer wanting to buy a particular product while also making sure the product being pushed out to the customer is in stock or the most up-to-date version of the product.
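The personalization flow described above can be sketched in a few lines. This is an illustration under stated assumptions, not Rockset's or any customer's implementation: the customer vector is taken to be the simple average of the vectors of products the customer viewed, cosine similarity stands in for "likelihood of interest," and the catalog, product IDs and `recommend` function are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def mean_vector(vectors):
    """Average a non-empty list of equal-length vectors component-wise."""
    count = len(vectors)
    return [sum(component) / count for component in zip(*vectors)]


# Hypothetical catalog: each product has an embedding plus stock metadata.
catalog = {
    "trail-shoe": {"vector": [0.9, 0.1, 0.0], "in_stock": True},
    "road-shoe": {"vector": [0.8, 0.2, 0.1], "in_stock": True},
    "race-shoe": {"vector": [0.85, 0.15, 0.0], "in_stock": False},
    "running-socks": {"vector": [0.7, 0.3, 0.0], "in_stock": True},
    "garden-hose": {"vector": [0.0, 0.1, 0.9], "in_stock": True},
}


def recommend(viewed_ids, catalog, limit=2):
    """Rank in-stock, not-yet-viewed products by similarity to the customer."""
    customer_vector = mean_vector([catalog[pid]["vector"] for pid in viewed_ids])
    candidates = [
        (pid, cosine_similarity(customer_vector, item["vector"]))
        for pid, item in catalog.items()
        if item["in_stock"] and pid not in viewed_ids  # metadata filter
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [pid for pid, _ in candidates[:limit]]


recommendations = recommend(["trail-shoe", "road-shoe"], catalog)
print(recommendations)
```

In this sample data, the out-of-stock race shoe is the closest match to the customer's vector but is filtered out before ranking, which is the role the stock metadata plays in the scenario.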

"That's how every personalization engine works these days," Venkataramani said. "What's happening is that you have metadata about whether an item is in stock and the vector data about the likelihood of a logged-in user being interested."

In fact, Venkataramani noted that the primary reason Rockset added support for vector embeddings was a request from an e-commerce vendor that needed to personalize recommendations for its customers.

The company was doing its own ad hoc work to combine vectors with Rockset's metadata filtering capabilities and asked if Rockset could build its own support for vector embeddings.

The two then worked together to develop Rockset's new support for vector embeddings.

"The customer came to us and said, 'You have to build this for us,'" Venkataramani said. "It was a natural extension of what we [do] and a good functionality to add for so many of our customers."


Now that Rockset has added support for vector embeddings, it plans to make the vector search process faster, according to Venkataramani.

One way of accomplishing that will be to develop similarity indexes to show users which vectors are similar to others without requiring searches to find exact matches.

"There is a lot of interest in creating similarity indexes so vectors aren't packaged with just one or two others," Venkataramani said. "We want to support any kind of similarity index that a user wants to build so they can natively do approximate vector searches."
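Rockset has not said which similarity indexes it will support, so the following is only a sketch of one well-known approach to approximate search, random-hyperplane hashing. Vectors that fall on the same side of every random plane land in the same bucket, and a query compares itself only against its own bucket instead of every stored vector. The class name, parameters and sample data are hypothetical.

```python
import math
import random
from collections import defaultdict


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


class HyperplaneIndex:
    """Toy approximate index that buckets vectors by random-plane signs."""

    def __init__(self, dimensions, num_planes=4, seed=0):
        rng = random.Random(seed)
        self.planes = [
            [rng.gauss(0.0, 1.0) for _ in range(dimensions)]
            for _ in range(num_planes)
        ]
        self.buckets = defaultdict(list)

    def _signature(self, vector):
        # One boolean per plane: which side of the plane the vector lies on.
        return tuple(
            sum(p * v for p, v in zip(plane, vector)) >= 0.0
            for plane in self.planes
        )

    def add(self, item_id, vector):
        self.buckets[self._signature(vector)].append((item_id, vector))

    def query(self, vector, limit=3):
        # Only vectors in the query's bucket are scored, so near neighbors
        # that hashed to a different bucket can be missed.
        candidates = self.buckets.get(self._signature(vector), [])
        ranked = sorted(
            candidates,
            key=lambda item: cosine_similarity(item[1], vector),
            reverse=True,
        )
        return [item_id for item_id, _ in ranked[:limit]]


index = HyperplaneIndex(dimensions=3)
index.add("trail-shoe", [0.9, 0.1, 0.0])
index.add("road-shoe", [0.8, 0.2, 0.1])
index.add("garden-hose", [0.0, 0.1, 0.9])

matches = index.query([1.8, 0.2, 0.0])
print(matches)
```

The trade-off is the one implied by "approximate": each query touches fewer vectors, at the cost of occasionally missing a true nearest neighbor. Production indexes use more sophisticated structures, but the principle of narrowing the candidate set before scoring is the same.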

Catanzano, meanwhile, said it's important that Rockset continue to respond to its customers' needs.

He said the vendor is already well established in the market for real-time cloud databases at scale. Therefore, without obvious new capabilities to add, using customer needs to guide its roadmap is smart for Rockset.

"This announcement is a good example of how they are listening to the customers and innovating themselves to keep up," Catanzano said.

Eric Avidon is a senior news writer for TechTarget Editorial and is a journalist with more than 25 years of experience. He covers analytics and data management.
