Date: 2023-08-24

Neo4j added vector search and vector storage to its core database capabilities to help customers get better results from semantic searches and generative AI applications.

In addition, the vendor aims to reduce AI "hallucinations" -- inaccurate and misleading responses generated by AI -- by adding vector search and storage as core capabilities.

Vector search is a way to search unstructured data, such as text and images, by assigning it a numerical representation to give it structure. Once assigned a numerical value, the unstructured data can be used in semantic searches so users can find similar data using approximate nearest neighbor algorithms and eventually model that similar data to inform decisions.
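As a rough illustration of the idea, the sketch below ranks a few text snippets by cosine similarity to a query vector. It is not Neo4j's implementation: the three-dimensional vectors are made up for the example (real embeddings come from a model and have far more dimensions), the names `DOCUMENTS`, `cosine_similarity` and `nearest` are hypothetical, and the search is an exact brute-force scan rather than an approximate nearest neighbor index.

```python
import math

# Hypothetical, hand-written embeddings. Real systems use model-generated
# vectors with hundreds or thousands of dimensions.
DOCUMENTS = {
    "graph databases store nodes and relationships": [0.9, 0.1, 0.0],
    "vector indexes speed up similarity search": [0.1, 0.9, 0.1],
    "knowledge graphs link related entities": [0.8, 0.2, 0.1],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vector, documents, k=2):
    """Exact (brute-force) search: score every document, keep the top k."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in documents.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:k]]


# Assumed query embedding, chosen to sit close to the graph-related snippets.
query = [1.0, 0.1, 0.0]
top = nearest(query, DOCUMENTS)
print(top)
```

An approximate nearest neighbor index trades a small amount of accuracy for speed by avoiding the full scan that `nearest` performs here.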

Keyword searches likewise attempt to discover similar data. Vector searches, however, provide faster and more relevant results, according to Neo4j.

In addition, customers can improve the accuracy of generative AI models and semantic searches by using vectors to index previously unstructured data. Language models and semantic searches tend to favor recent data, so indexed data gives users a way to bring in information that would otherwise be ignored and thereby improve accuracy.

Based in San Mateo, Calif., Neo4j is a graph database vendor whose platform enables customers to access and use data in different ways than traditional relational databases.

Graph databases simplify connections between data points, enabling each to connect with more than one other data point at a time. That lets users more quickly discover and combine data from multiple sources and speeds the process of turning data into insights and actions. Relational databases, by contrast, enable data points to connect to just one other data point at a time.
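To make the many-connections idea concrete, here is a minimal sketch of a graph held in a plain Python dictionary, with a breadth-first traversal that follows relationships outward from one data point. The node names, relationship types and the `reachable` helper are all invented for illustration, and this says nothing about how Neo4j stores or queries data internally.

```python
from collections import deque

# Hypothetical toy graph: each node maps to a list of
# (relationship_type, neighbor) pairs, so one node can have many connections.
GRAPH = {
    "Alice": [("WORKS_AT", "Acme"), ("KNOWS", "Bob")],
    "Bob": [("WORKS_AT", "Globex")],
    "Acme": [("SUPPLIES", "Globex")],
    "Globex": [],
}


def reachable(graph, start, max_hops):
    """Breadth-first traversal: map each node within max_hops to its distance."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for _relationship, neighbor in graph.get(node, []):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


hops = reachable(GRAPH, "Alice", 2)
print(hops)
```

Here "Alice" is connected directly to both a company and a person, and a two-hop traversal also reaches "Globex" through either of them.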

In addition to Neo4j, TigerGraph is a graph database specialist, while tech giants including AWS and Oracle also offer graph databases.

New capabilities

In June, Neo4j unveiled an integration with Google's Vertex AI that enables users to improve their knowledge graphs with generative AI.

Through the integration, Neo4j customers can now use natural language to interact with knowledge graphs rather than code, use Vertex AI to transform unstructured data into knowledge graphs, enrich existing knowledge graphs with generative AI and validate responses from large language models (LLMs) to ensure hallucinations don't result in decisions based on bad data.

Building on the generative AI capabilities added through its integration with Vertex AI -- as well as through ongoing relationships with OpenAI, Microsoft and AWS -- Neo4j on August 22 added vector search to its core database capabilities.

Vector search isn't, on its own, a generative AI capability. But it improves the accuracy of both generative AI models and semantic searches.

For that reason, vector search is a significant addition to Neo4j's core capabilities, according to Doug Henschen, an analyst at Constellation Research.

"There's clearly a broad sense -- among database vendors and customers alike -- that vector search capabilities should be a feature within the databases that customers are already using to manage their data," he said. "This feature will give Neo4j customers an opportunity to inform and improve the accuracy of semantic search and generative AI capabilities."

Neo4j is not alone, however, in adding vector search, Henschen continued. Given that vector search improves the accuracy of semantic searches and generative AI applications -- and that LLMs, such as ChatGPT and Google Bard, sometimes hallucinate and are subject to security risks -- many database vendors are making vector search a core capability.

Henschen noted that Alibaba, AWS, Cassandra, Cockroach Labs, DataStax and Dremio are among those that have already added vector search to their database capabilities and that more vendors have vector search capabilities in development.

"In announcing this feature, Neo4j is joining a rapidly expanding group of database and data platform companies that have recently made, or are about to make, vector search-related announcements," he said.

One of the key decisions Neo4j had to make was whether to make vector search and storage core capabilities of its existing database or develop a new database specializing in vector search and storage, according to Sudhir Hasbe, the vendor's chief product officer.

After recognizing the growing interest in generative AI, Neo4j consulted about a dozen of its major customers and canvassed them for input on how to go about incorporating generative AI.

Customers told the vendor they wanted to use natural language to ask questions of their knowledge graphs, Hasbe said. They wanted Neo4j to be agnostic in terms of integrating with generative AI vendors. They wanted their generative AI models to have the long-term memories provided by vector storage rather than be trained using only recent data. And they wanted it all in one place.

"We had to ask whether a vector database should be a different category or whether it should be a feature of an existing database," Hasbe said. "Based on feedback, it made sense to make vector search a feature of our database wherein you can take explicit relationships and implicit similarities and combine them for a single use case. Keeping different environments didn't seem like the right solution."

An organization's datasets are displayed in a graph database from Neo4j.