Couchbase unveils vector search that extends to the edge

The vendor's new capabilities will enable users to feed and train AI models, and through an embeddable version of its database extend to edge devices for real-time insight.

Couchbase on Thursday unveiled vector search capabilities in a move aimed at helping customers develop AI models and applications, including generative AI, using their own proprietary data.

Based in Campbell, Calif., Couchbase is a database vendor that provides the Couchbase Capella database-as-a-service platform, first launched in 2021, for its cloud-based customers and Couchbase Enterprise for on-premises users.

Vector search is now in preview, and Couchbase expects to make the capabilities generally available for both its cloud and on-premises customers in May.

With vector search now viewed as a critical component of the generative AI development process, numerous database vendors have already introduced vector search and storage capabilities. Database specialists MongoDB and Rockset are Couchbase peers that previously unveiled vector search capabilities. In addition, data platform vendors including Databricks, Google and Snowflake have all added vector search.

However, the surge in vector search's popularity over the past year is recent enough that Couchbase is not at a competitive disadvantage, according to Stephen Catanzano, an analyst at TechTarget's Enterprise Strategy Group.

"Everyone started announcing vector search at the end of last year," Catanzano said. "Some already had it and started making it better, and others announced it and built it. But I don't think they are behind."

In fact, Couchbase's vector search capabilities are somewhat differentiated from those introduced by other vendors, he continued.

Unlike vector search that is limited to the on-site database environment, Couchbase's vector search functionality extends across clouds and to mobile and IoT devices via Couchbase Lite, a document database that can be embedded into edge devices to enable real-time decisions.

"I like Couchbase's [approach] of empowering developers with a new application development angle," Catanzano said.

In concert with unveiling vector search capabilities, Couchbase also revealed new integrations with AI development frameworks LangChain and LlamaIndex.

Vector search

Vector search dates to the early 2000s. However, its popularity has surged over the past year because of the rise of generative AI as a means of enabling data management and data analysis.

In particular, vector search is a key component of retrieval-augmented generation (RAG) pipelines, which let enterprises supply generative AI models with their own proprietary data at query time rather than retraining the models.

Large language models (LLMs) such as ChatGPT and Google Gemini enable people to ask questions using natural language and receive responses. In addition, the models are able to generate images and code when prompted and can be trained to automate tasks.

For those reasons, when integrated with data management and analytics tools, they have the potential to make analytics use more widespread within organizations and make data experts more efficient.

However, some of the most popular LLMs are trained only on public data. They can quickly answer a question about the Great Depression or the American Revolution, and they can even write prose and poems on their own. But they don't know anything about a private business because they haven't been trained on that business's data.

Therefore, to make language models useful for business purposes, they need access to proprietary data.

RAG pipelines feed that data from databases and other data storage types into language models. Vector search, meanwhile, is the means of discovering the relevant data for those RAG pipelines.
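As a rough illustration of that flow, the Python sketch below retrieves a few passages and places them ahead of the user's question before the text would be sent to a language model. It is not Couchbase's API: `build_rag_prompt`, `fake_retrieve` and the sample documents are hypothetical names and data invented for this example, and the retriever simply returns the first passages where a real pipeline would run a vector search.

```python
from typing import Callable, List


def build_rag_prompt(question: str,
                     retrieve: Callable[[str, int], List[str]],
                     top_k: int = 2) -> str:
    """Fetch top_k passages and prepend them to the question as context."""
    passages = retrieve(question, top_k)
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


def fake_retrieve(query: str, k: int) -> List[str]:
    """Hypothetical stand-in for a vector search call against a database."""
    documents = [
        "Item A is restocked every Monday.",
        "Store 12 sold 340 units of item A last week.",
        "The cafeteria menu changes daily.",
    ]
    # A real retriever would rank documents by similarity to the query.
    return documents[:k]


prompt = build_rag_prompt("How often is item A restocked?", fake_retrieve)
print(prompt)
```

The model itself is unchanged in this arrangement. Only the prompt carries the proprietary data, which is why the quality of the retrieval step matters so much.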

Vectors enable similarity searches so that large amounts of data can be used to train models. They also are a means to give structure to unstructured data types such as text, images and audio files so that they can be combined with structured data and used to feed RAG pipelines and generative AI models.
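A similarity search can be sketched in a few lines of Python. The example below ranks stored items by cosine similarity to a query vector. The three-dimensional vectors are made up by hand for illustration; in practice an embedding model would produce vectors with hundreds or thousands of dimensions, and a database would use an approximate index rather than scanning every item. The function names are assumptions, not part of any vendor's product.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query: List[float],
            index: Dict[str, List[float]],
            k: int = 2) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [(doc_id, cosine_similarity(query, vec))
              for doc_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hand-made vectors standing in for real embeddings (assumption).
index = {
    "running shoes": [0.9, 0.1, 0.0],
    "trail sneakers": [0.8, 0.2, 0.1],
    "coffee maker": [0.0, 0.1, 0.9],
}
query_vector = [0.85, 0.15, 0.05]  # pretend embedding of "jogging footwear"

results = nearest(query_vector, index)
for doc_id, score in results:
    print(f"{doc_id}: {score:.3f}")
```

Neither footwear item shares a keyword with "jogging footwear," yet both rank far above the coffee maker because their vectors point in nearly the same direction as the query.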

Couchbase's vector search capabilities enable the following:

  • Similarity and hybrid search so that customers can combine disparate data types to train models and applications.
  • RAG to improve the accuracy of AI models and applications by feeding them high volumes of data as well as up-to-date data.
  • Low response latency in part due to new columnar capabilities now in private preview that enable users to concurrently move data.
  • Support for graph relationship traversals that help map neural networks and establish hierarchies for similarities within data.
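To make the hybrid search item above more concrete, here is a minimal Python sketch that first filters records on a structured field and then ranks the survivors by vector similarity. It does not use Couchbase's query syntax; `hybrid_search`, the `category` field and the two-dimensional vectors are all hypothetical and chosen only to keep the example small.

```python
from typing import Dict, List, Union

Record = Dict[str, Union[str, List[float]]]


def dot(a: List[float], b: List[float]) -> float:
    """Dot product, used here as a simple similarity score."""
    return sum(x * y for x, y in zip(a, b))


def hybrid_search(query_vec: List[float],
                  records: List[Record],
                  category: str,
                  k: int = 2) -> List[str]:
    """Filter on an exact-match field, then rank by vector similarity."""
    candidates = [r for r in records if r["category"] == category]
    ranked = sorted(candidates,
                    key=lambda r: dot(query_vec, r["vector"]),
                    reverse=True)
    return [r["name"] for r in ranked[:k]]


# Illustrative catalog; names, categories and vectors are invented.
catalog = [
    {"name": "rain jacket", "category": "apparel", "vector": [0.9, 0.1]},
    {"name": "umbrella", "category": "accessories", "vector": [0.8, 0.2]},
    {"name": "wool sweater", "category": "apparel", "vector": [0.2, 0.9]},
]

matches = hybrid_search([1.0, 0.0], catalog, category="apparel")
print(matches)
```

The umbrella is a close vector match for the query but is excluded by the structured filter, which is the point of combining the two kinds of search.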

In addition, by extending vector search to edge devices through Couchbase Lite, Couchbase aims to enable enterprises to run AI applications anywhere.

It's those embedded capabilities that differentiate Couchbase's introduction of vector search from those of most other vendors, according to Catanzano.

"Businesses are racing to build hyperpersonalized, high-performing and adaptive applications powered by generative AI that deliver exceptional experiences to their end users," he said. "This is Couchbase's push on vector search, which is to power mobile apps and IoT devices with adaptive decision-making."

Figure: A comparison of vector search and keyword search.

Scott Anderson, Couchbase's senior vice president of product management and business operations, likewise pointed to embedded capabilities as a way for Couchbase to differentiate its new offering from those unveiled by other data management vendors.

"We think that unlocks some interesting use cases for customers ... and can help their applications return better and more accurate results," he said.

Retail and field services are two examples of potential applications for vector search at the edge.

Meanwhile, the impetus for adding vector search capabilities came from a combination of customer feedback and Couchbase's own monitoring of data management trends, according to Anderson.

"When we look back 12 to 14 months, that's when the new era of AI hit, and it was very obvious to us that was an area we needed to invest in," he said. "That was then validated over the last few quarters by customers talking about [AI], talking about their use cases and talking about hybrid search."

Future plans

As Couchbase moves forward following the introduction of vector search and the integrations with LangChain and LlamaIndex, adding more integrations to broaden the Couchbase ecosystem will be a focal point, according to Anderson.

"That is critically important for us," he said.

In addition, part of Couchbase's roadmap is developing a feature store where customers can easily access tools commonly used during development of AI and machine learning products.

Catanzano, meanwhile, said Couchbase could expand its offering by adding more support for graph technology.

Graph databases, like vector databases, enable users to develop neural networks that connect data beyond exact matches. Traditional relational databases allow data points to connect to just one other data point at a time, limiting the scale of searches. Graph databases, however, enable data points to connect to multiple other data points to better discover relationships between data.
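The idea of following connections across several data points can be sketched as a breadth-first traversal. The Python example below is only an illustration with an invented in-memory graph; the node names and the `reachable` function are hypothetical, and a graph database would run this kind of query with its own storage and query language.

```python
from collections import deque
from typing import Dict, List

# Invented adjacency list: each node points to the nodes it relates to.
graph: Dict[str, List[str]] = {
    "customer:ana": ["order:1001"],
    "order:1001": ["product:tent", "store:denver"],
    "product:tent": ["supplier:acme"],
    "store:denver": [],
    "supplier:acme": [],
}


def reachable(graph: Dict[str, List[str]],
              start: str,
              max_hops: int) -> List[str]:
    """Return nodes within max_hops of start, in breadth-first order."""
    seen = {start}
    queue = deque([(start, 0)])
    found: List[str] = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                found.append(neighbor)
                queue.append((neighbor, hops + 1))
    return found


two_hops = reachable(graph, "customer:ana", max_hops=2)
three_hops = reachable(graph, "customer:ana", max_hops=3)
print(two_hops)
print(three_hops)
```

Raising the hop limit from two to three is enough to link the customer to a supplier, a relationship that no single record states directly.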

Vendors including Neo4j and TigerGraph specialize in graph technology.

"Possibly graph capabilities [would be a way to expand]," Catanzano said. "They complement vector by looking at data more three-dimensionally, and the graph database vendors believe that will be the future."

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
