SingleStore update adds new tools to fuel GenAI, analytics

The database vendor unveiled Pro Max, a rebranded version of its platform that includes new vector search and change data capture capabilities that enable AI and BI at scale.

SingleStore on Wednesday introduced Pro Max, an updated version of the vendor's suite with a rebranding aimed at demonstrating SingleStore's evolution from database specialist to a more broad-based data platform vendor.

Among other features, the update includes indexed vector search and change data capture (CDC) capabilities that simplify and speed up the development of traditional AI, generative AI and analytics applications.

Based in San Francisco, SingleStore was founded in 2011 as MemSQL. The vendor's tools work with both on-premises and cloud-based deployments and are designed to quickly ingest data from myriad sources to enable near-real-time queries and transactions.

Similar vendors include database specialists such as MongoDB and the open source MySQL database, as well as tech giants that offer database capabilities, including AWS, Google and Microsoft.

In October 2023, SingleStore unveiled vector search capabilities and a new compute layer -- among other features -- designed to enable real-time AI development.

A year earlier, the vendor added $30 million to the $116 million it had raised that July to close its $146 million F-2 funding round, aimed at furthering product development, improving marketing campaigns and fueling geographic expansion.

Vector search

Vector search has emerged as a critical tool for generative AI development.

Vectors are numerical representations of unstructured data that essentially give data such as text, video and audio files structure so that it can be discovered and used to inform decisions. Vector databases, meanwhile, date back to the early 2000s.

Until the past year, however, vector search was a feature used mostly by large organizations attempting to find and operationalize unstructured data that would otherwise be left untouched in a data lake.

Vectors enable large-scale similarity searches, which help data teams discover data points that relate to one another amid millions of other unrelated data points. Those related data points can then be combined to form a data set that can be used to build and train AI models and other data products.
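As a rough illustration of what a similarity search does, the following Python sketch ranks a handful of vectors by cosine similarity to a query vector. The three-dimensional vectors and document names are hypothetical toy values, not output from any real embedding model, and the brute-force scan shown here is a generic technique rather than SingleStore's implementation.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_similar(query, vectors, k=2):
    """Score every stored vector against the query and return the k best."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical toy embeddings: two related documents and one unrelated one.
DOCS = {
    "invoice_q3": [0.9, 0.1, 0.0],
    "invoice_q4": [0.8, 0.2, 0.1],
    "holiday_party": [0.0, 0.1, 0.9],
}

for name, score in top_k_similar([1.0, 0.0, 0.0], DOCS, k=2):
    print(name, round(score, 3))
```

Because every stored vector is compared with the query, this is an exact search; its cost grows with the number of vectors, which is the problem indexing is meant to address.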

[Chart: A comparison of traditional keyword search and vector search.]

With generative AI exploding in popularity in the 14 months since OpenAI launched ChatGPT, and vendors such as Google and Microsoft following with their own large language models, organizations have discovered that vector search can be used to train LLMs with their own proprietary data to help make business decisions.

In response, SingleStore and numerous other data management vendors including Dremio and Neo4j have added new vector search capabilities in recent months.

SingleStore first provided exact nearest neighbor search in 2017 and unveiled approximate nearest neighbor (ANN) search in its October update to expand the parameters of vector searches.

Indexed vector search enables those ANN searches, increasing their speed and accuracy by organizing and storing vectors more efficiently. Indexing uses algorithms to predetermine the similarity of one vector to another, making similar vectors fast and easy to discover.
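One common way to picture this kind of indexing is a bucket-based scheme, in which each vector is filed under its nearest centroid when it is stored and a query scans only the bucket it falls into. The Python sketch below is a minimal version of that generic idea. The `BucketIndex` class, the hand-picked centroids and the sample vectors are all assumptions made for illustration; the article does not describe which index structures SingleStore uses.

```python
import math

def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class BucketIndex:
    """Toy index: each vector is filed under its nearest centroid at insert time."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.buckets = [[] for _ in centroids]

    def _nearest_centroid(self, vec):
        return min(range(len(self.centroids)),
                   key=lambda i: distance(vec, self.centroids[i]))

    def add(self, key, vec):
        self.buckets[self._nearest_centroid(vec)].append((key, vec))

    def search(self, query, k=1):
        # Only the query's own bucket is scanned, so the answer is approximate:
        # a close vector filed in a neighboring bucket would be missed.
        candidates = self.buckets[self._nearest_centroid(query)]
        ranked = sorted(candidates, key=lambda item: distance(query, item[1]))
        return [key for key, _ in ranked[:k]]

# Hypothetical centroids and vectors, chosen by hand for illustration.
index = BucketIndex(centroids=[[0.0, 0.0], [10.0, 10.0]])
for key, vec in [("a", [1.0, 1.0]), ("b", [0.0, 2.0]),
                 ("c", [9.0, 9.0]), ("d", [11.0, 10.0])]:
    index.add(key, vec)

print(index.search([8.0, 8.0], k=1))  # prints ['c']
```

The query above is compared with two stored vectors instead of four. That saving is the point of an index, and the possibility of missing a neighbor in another bucket is why the result is called approximate.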

Given the importance of vector search as a tool for generative AI development, the indexed vector search capabilities are significant for SingleStore customers, according to Kevin Petrie, an analyst at Eckerson Group.

"SingleStore helps address two primary enterprise requirements for GenAI," he said.

First, the vendor's vector search capabilities deliver unstructured data to fine-tune and prompt domain-specific language models, Petrie continued. And SingleStore's tools manage vectors alongside tables and other data types, which is important because generative AI workflows will grow to include predictive machine learning and other analytical or operational functions.

"It helps to have one platform that can meet diverse needs for these multifaceted workflows," Petrie said.

More than just serving as a conduit for generative AI development, the vendor's vector search capabilities -- in concert with other Pro Max features -- are aimed at informing LLMs in near real time, said Madhukar Kumar, SingleStore's chief marketing officer.

Speed has long been a focus for SingleStore. Indexed vector search will play a role in enabling customers to build retrieval-augmented generation (RAG) pipelines that automatically ingest unstructured data, vectorize the data and feed it into the proper data pipeline to inform an application or model.

For example, a recorded virtual meeting can be automatically transcribed and loaded into a vector database, the text assigned vectors, and the vectors fed into RAG pipelines to inform analytics and AI -- all within milliseconds, according to Kumar.
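The flow described above, from transcript to vectors to retrieved context, can be sketched in a few lines of Python. This is a toy under stated assumptions: the `embed` function uses simple word counts where a real pipeline would call an embedding model, `VectorStore` is a hypothetical in-memory stand-in for a vector database, and the transcript text is invented for the example.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy embedding: word counts. A real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

class VectorStore:
    """Hypothetical in-memory stand-in for a vector database."""

    def __init__(self):
        self.rows = []

    def ingest(self, transcript):
        # Split the transcript into sentences and store each with its vector.
        for sentence in re.split(r"(?<=[.!?])\s+", transcript.strip()):
            if sentence:
                self.rows.append((sentence, embed(sentence)))

    def retrieve(self, question, k=1):
        query = embed(question)
        ranked = sorted(self.rows, key=lambda row: cosine(query, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]

def build_prompt(question, store, k=1):
    """Assemble the text that would be sent to a language model."""
    context = "\n".join(store.retrieve(question, k))
    return f"Context:\n{context}\n\nQuestion: {question}"

store = VectorStore()
store.ingest("The launch date moved to March. Budget approval is still pending. "
             "Alice will own the vendor contract.")
print(build_prompt("Who owns the vendor contract?", store))
```

The sketch stops at the prompt; sending it to a model, and doing all of this within milliseconds at production scale, is the part a data platform has to supply.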

"That's what this is all about," he said. "It allows you to live RAG, which allows you to get closer to real-time AI."

Additional capabilities

Beyond indexed vector search, a key capability introduced in Pro Max is CDC for data ingest and egress, according to Petrie.

The cost of maintaining a data infrastructure has risen in recent years as data volume and complexity increase and organizations add more tools to develop and monitor data pipelines. In particular, the cost of maintaining cloud-based data infrastructures is becoming a problem for many organizations.

SingleStore's new CDC capabilities are designed to help organizations better control the cost of moving data in and out of the vendor's database by adding native data movement between SingleStore and tools from certain vendors.

Specifically, SingleStore is adding CDC capabilities for MongoDB and MySQL and ingestion from Apache Iceberg that eliminate the need for third-party CDC tools. In addition, SingleStore is adding support for CDC capabilities that simplify data migration from SingleStore to applications including databases, data warehouses and lakehouses.
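For readers unfamiliar with the mechanism, change data capture generally works by reading an ordered log of inserts, updates and deletes from a source system and replaying them against a target. The Python sketch below shows that general pattern only. The event format, the `replicate` function and the sample rows are hypothetical and do not reflect the actual SingleStore, MongoDB or MySQL interfaces.

```python
def apply_change(target, event):
    """Apply one change event to the target table (a dict keyed by primary key)."""
    op = event["op"]
    if op in ("insert", "update"):
        target[event["key"]] = event["row"]
    elif op == "delete":
        target.pop(event["key"], None)
    else:
        raise ValueError(f"unknown operation: {op}")

def replicate(change_log, target, last_applied=0):
    """Apply events newer than the checkpoint; return the new checkpoint.

    Assumes the log is ordered by its sequence number.
    """
    for event in change_log:
        if event["seq"] <= last_applied:
            continue  # already applied on an earlier run
        apply_change(target, event)
        last_applied = event["seq"]
    return last_applied

# Hypothetical change log emitted by a source database.
CHANGE_LOG = [
    {"seq": 1, "op": "insert", "key": 101, "row": {"name": "Ada", "plan": "free"}},
    {"seq": 2, "op": "insert", "key": 102, "row": {"name": "Lin", "plan": "free"}},
    {"seq": 3, "op": "update", "key": 101, "row": {"name": "Ada", "plan": "standard"}},
    {"seq": 4, "op": "delete", "key": 102},
]

replica = {}
checkpoint = replicate(CHANGE_LOG, replica)
print(checkpoint, replica)
```

Keeping a checkpoint means only new changes are moved on each run, which is where the cost savings over repeated full copies come from.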

"By adding change data capture capabilities, SingleStore removes an adoption barrier by simplifying data transfer across heterogeneous environments," Petrie said. "Most enterprises still need CDC or other data pipeline tools, but this reduces the need for them for certain sources and targets."

In addition to indexed vector search and new CDC capabilities, Pro Max includes a free tier.

SingleStore offers both SaaS and self-managed pricing options. The vendor does not publicize the cost of its self-managed option. Its SaaS option comes in three tiers, with the Standard edition starting at $0.80 per hour and the Premium edition starting at $1.60 per hour. The cost of the Dedicated edition, which addresses the needs of organizations with unique security requirements and includes more support than the Standard and Premium editions, also is not publicized.

The free tier of Pro Max is a permanent option, according to Kumar, but it comes with limitations such as having 10 or fewer databases. For some users, that might be enough. But for most, the hope is that they will upgrade to one of the other tiers.

"We want people to be able to go in and build stuff to see the value on their own," Kumar said. "If they like it, great, they can upgrade. And if they don't want to upgrade and [it's enough], that's fine as well."

Other new features include the following:

  • A new on-demand compute service for GPUs and CPUs that enables users to run database-adjacent workloads including data preparation without having to unnecessarily move data.
  • The general availability of SingleStore Kai, an API unveiled in public preview in early 2023 that is designed to deliver faster analytics on MongoDB by eliminating the need for query changes and data transformations.
  • Projections, a feature aimed at further improving the query speed of SingleStore's database by adding new sort and shard keys.
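To see why an additional sort key can speed up queries, consider the Python sketch below. It keeps a second copy of a small table sorted on a different column so that lookups on that column can use binary search instead of a full scan. This illustrates the general idea only; the table, the column names and the `orders_for` helper are hypothetical, and the article does not detail how SingleStore implements Projections.

```python
import bisect

# Hypothetical base table of (order_id, customer, amount), sorted by order_id.
ORDERS = [
    (1, "carol", 40.0),
    (2, "alice", 15.5),
    (3, "bob", 22.0),
    (4, "alice", 9.0),
]

# A second copy of the same rows, sorted by customer instead of order_id.
BY_CUSTOMER = sorted(ORDERS, key=lambda row: row[1])
CUSTOMER_KEYS = [row[1] for row in BY_CUSTOMER]

def orders_for(customer):
    """Binary-search the customer-sorted copy rather than scanning every row."""
    lo = bisect.bisect_left(CUSTOMER_KEYS, customer)
    hi = bisect.bisect_right(CUSTOMER_KEYS, customer)
    return BY_CUSTOMER[lo:hi]

print(orders_for("alice"))
```

The tradeoff is extra storage and extra work on writes, since each sorted copy has to be kept in step with the base table.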

Some Pro Max features such as indexed vector search are now generally available. Projections, CDC in from MySQL, CDC out and the free shared tier are in public preview.

Each of the features, meanwhile, represents SingleStore's ongoing effort to provide tools that enable enterprises of all sizes to operationalize data at scale, according to Kumar.

"As more companies go from prototype to production, they need advanced features that are enterprise-grade," he said. "Over the past 10 years, that is what we have been building."

Petrie, however, noted that while SingleStore's new tools further advance the capabilities of the vendor's developing data platform, they aren't necessarily different from those offered by other data platform providers.

While the features in Pro Max are perhaps more advanced than those provided by database specialists, data platform vendors including Databricks offer many of the same capabilities.

"This release is part of an industry trend," Petrie said. "A number of vendors, including Databricks and the hyperscalers, aim to support all workloads."

Future plans

With Pro Max now available, SingleStore plans to continue expanding beyond its traditional database roots toward becoming a data platform vendor like Databricks or Snowflake, Kumar said.

He noted that there are more than 300 databases, most of which are niche or general-purpose but with a specialty. Many have added vector search capabilities, he said. Most, however, are not part of a fully integrated data stack.

SingleStore aims to become a fully integrated data platform, according to Kumar. Speed, meanwhile, will continue to be a primary focus.

"We are headed toward being an integrated data platform that responds in a few milliseconds irrespective of the data type so that it can be used in AI," Kumar said.

Beyond speed, helping to ensure data quality will also remain critical, he added.

"If someone has clean, fresh, fast-moving data, they can do things a lot faster than someone else whose data is not clean and has a lot of data movement," Kumar said. "Our vision is to be able to help companies take their data, make it actionable in milliseconds and be able to combine it with LLMs."

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
