ScyllaDB adds vector search to managed database platform
The feature's direct integration in the X Cloud platform is aimed at simplifying AI development by eliminating the need for customers to add specialized databases.
ScyllaDB on Tuesday launched vector search and storage capabilities in its X Cloud database platform.
Vector search and storage are now fundamental to many AI pipelines because they enable enterprises to operationalize unstructured data as well as make traditional structured data more easily discoverable.
Vector embeddings are algorithmically assigned numerical representations of data that capture its characteristics, such as semantic meaning and relationships to other data. As a result, they enable similarity searches that help developers and engineers find more relevant data to train AI tools than they can with other search types, including exact matches. AI tools require far greater volumes of data than traditional analytics tools to be accurate and reliable.
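To illustrate the difference between similarity search and exact matching, the following is a minimal Python sketch, not ScyllaDB's implementation. The three-dimensional vectors, the document names and the `nearest` helper are all hypothetical; real embeddings are produced by a model and typically have hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, hand-written for illustration only.
DOCS = {
    "refund policy": [0.9, 0.1, 0.0],
    "return an item": [0.8, 0.2, 0.1],
    "gpu pricing": [0.0, 0.1, 0.9],
}

def nearest(query_vec, docs, k=2):
    """Rank documents by cosine similarity to the query and keep the top k."""
    ranked = sorted(
        docs,
        key=lambda name: cosine_similarity(query_vec, docs[name]),
        reverse=True,
    )
    return ranked[:k]

# The query vector matches no stored vector exactly, so an exact-match
# lookup finds nothing, while a similarity search still returns neighbors.
query = [0.85, 0.15, 0.05]
exact_hits = [name for name, vec in DOCS.items() if vec == query]
top = nearest(query, DOCS)
```

In this sketch the exact-match lookup comes back empty, while the similarity search returns the two documents whose vectors point in nearly the same direction as the query. This brute-force version compares the query against every stored vector; production systems use approximate-nearest-neighbor indexes to avoid that cost.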
Enterprises have substantially increased their investments in building AI tools since OpenAI's November 2022 launch of ChatGPT marked significant improvement in generative AI (GenAI) capabilities. Given that investment and the vital role vectors play in AI development, data management vendors have added vector search and storage capabilities en masse.
For example, hyperscalers AWS and Oracle made vector search and storage key components of their database platforms while more specialized providers such as Databricks and MongoDB did so as well. In addition, vendors such as Pinecone, Milvus and Chroma are vector database specialists.
Now, ScyllaDB is making vector search and storage part of its X Cloud platform in a move that brings the vendor's fully managed database in line with those of competitors, according to Devin Pratt, an analyst at IDC. The vendor launched vector search and storage capabilities in its self-managed database on Jan. 8.
"Vector search is becoming standard for AI applications, and ScyllaDB's approach is to deliver it at scale inside the operational data platform teams already run," he said.
Of particular value to ScyllaDB customers is that having vector search and storage within X Cloud eliminates the need to run a separate vector database, Pratt added.
Matt Aslett, an analyst at ISG Software Research, similarly noted that ScyllaDB's addition of vector search and storage provides users with capabilities that have become critical components of AI development pipelines.
"Vector search and retrieval have quickly become a table stakes requirement for data platform providers to support use cases involving generative AI," he said.
ScyllaDB's vector search and storage capabilities are built on the vendor's shard-per-core architecture, a configuration used by distributed databases such as ScyllaDB that optimizes resource allocation and fosters scalability by dividing workloads into small, independent units called shards.
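The general idea of dividing work into independent shards can be sketched in a few lines of Python. This is an illustrative assumption, not ScyllaDB's actual partitioning scheme: the shard count, the SHA-256 hash and the `shard_for`, `put` and `get` helpers are all hypothetical.

```python
import hashlib

# Hypothetical: one shard per CPU core on a four-core machine.
NUM_SHARDS = 4

# Each shard owns its own private store; shards share no state.
shards = [dict() for _ in range(NUM_SHARDS)]

def shard_for(key, num_shards=NUM_SHARDS):
    """Map a key to a shard index using a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards

def put(key, value):
    """Write the value to the single shard that owns this key."""
    shards[shard_for(key)][key] = value

def get(key):
    """Read from the owning shard only; no other shard is consulted."""
    return shards[shard_for(key)].get(key)

put("user:42", {"name": "Ada"})
put("user:43", {"name": "Grace"})
owner_count = sum(1 for shard in shards if "user:42" in shard)
```

Because every key maps to exactly one shard, each shard can be served by its own core without coordinating with the others on reads or writes.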
ScyllaDB's Vector Store service automatically updates vector embeddings through change data capture capabilities and builds approximate-nearest-neighbor indexes within its main memory.
The move to add vector search and storage to X Cloud closely follows ScyllaDB's Jan. 15 X Cloud update and was motivated by a combination of customer feedback and broad market trends, according to Dor Laor, the vendor's co-founder and CEO.
"Customers have been asking for vector search to be added to ScyllaDB -- no one wants to run five different databases," he said. "In addition, AI is clearly taking the world by storm, and we have a unique solution for real-time AI at scale."
Based in Sunnyvale, Calif., ScyllaDB provides a NoSQL database platform compatible with Apache Cassandra and Amazon DynamoDB. Since being founded in 2012, the vendor has raised just over $100 million in venture capital funding. Competitors include Aerospike, Couchbase, Google Cloud Spanner, MongoDB and Redis along with Cassandra and DynamoDB.
Eric Avidon is a senior news writer for Informa TechTarget and a journalist with more than three decades of experience. He covers analytics and data management.