AI, natural language processing and semantic search applications are a driving force behind emerging data management operations. Data teams can use knowledge graphs in conjunction with these tools to derive insights traditional databases might not provide.
A knowledge graph helps organize context, connections and ontologies into a data management system. It also models the relationships between real-world entities to improve the accuracy of various data processing tasks and the insights they yield.
A knowledge graph often works on top of a graph database to describe the ontology of the information in the graph database. This combination provides a richer way to characterize what the underlying data means for AI applications than traditional databases, such as relational databases, can offer. Knowledge graph-enhanced databases can also be easier for business users to work with and understand compared to traditional databases.
Knowledge graphs vs. databases
The key distinction between a knowledge graph and a traditional database is how each one stores data, said Gregor Stühler, CEO at Scoutbee, a procurement automation platform.
A traditional database stores data in tables with predefined schemas. It organizes data in rows and columns, and establishes relationships between entities using primary and foreign keys.
"While traditional databases are efficient at storing structured data and handling basic queries, they can struggle to capture complex relationships and infer new knowledge from data," Stühler said.
A knowledge graph is a network of interconnected entities and their relationships, represented as nodes and edges. Information is organized to model real-world objects and their relationships, thus imparting that information to machines that consume it.
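The difference between the two storage models can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the supplier and city names, the `located_in` and `supplies` relations, and both helper functions are hypothetical and do not come from any product mentioned in this article.

```python
# Hypothetical example data; every name and relation here is illustrative.

# Relational view: two tables with predefined columns, linked by a foreign key.
suppliers = [
    {"supplier_id": 1, "name": "Acme Metals", "city_id": 10},
    {"supplier_id": 2, "name": "Borealis Parts", "city_id": 11},
]
cities = [
    {"city_id": 10, "name": "Hamburg"},
    {"city_id": 11, "name": "Rotterdam"},
]


def city_of_supplier_relational(supplier_name):
    """Resolve the relationship by joining the tables on city_id."""
    for supplier in suppliers:
        if supplier["name"] == supplier_name:
            for city in cities:
                if city["city_id"] == supplier["city_id"]:
                    return city["name"]
    return None


# Graph view: each fact is a (node, edge, node) triple, so the relationship
# itself is stored as data rather than implied by matching key columns.
triples = [
    ("Acme Metals", "located_in", "Hamburg"),
    ("Borealis Parts", "located_in", "Rotterdam"),
    ("Acme Metals", "supplies", "Borealis Parts"),
]


def neighbors(node, relation):
    """Follow all edges of one relation type that start at the given node."""
    return [obj for subj, rel, obj in triples if subj == node and rel == relation]


print(city_of_supplier_relational("Acme Metals"))  # Hamburg
print(neighbors("Acme Metals", "located_in"))      # ['Hamburg']
print(neighbors("Acme Metals", "supplies"))        # ['Borealis Parts']
```

In the graph view, a new kind of relationship such as `supplies` is just another triple. In the table view, it would typically need a new column or a new join table.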
Knowledge graphs and graph databases
Knowledge graphs work in conjunction with a graph database. On its own, a graph database maps relationships between data sets, much as someone might sketch a system on a whiteboard. The knowledge graph sits on top of this database to represent complex real-world entities and illustrate the relationships between them. The combination of a graph database and a knowledge graph helps nontechnical users visualize and analyze the data they need.
Think of knowledge graphs as a type of knowledge base, said Gabriel Montagne, senior product manager for the machine learning platform at Coveo, an enterprise search platform. The knowledge graph uses a graph-structured data model to integrate data and store interlinked descriptions of entities, events, situations or abstract concepts. It also encodes the semantics underlying the terminology. A knowledge graph includes an ontology that enables humans and machines to understand and reason about its contents.
A graph database uses a graph structure for semantic queries about the nodes, edges and properties used to represent and store data. However, the graph database does not typically include an ontology. As a result, more work is required to store and reason about the complex knowledge representations found in a knowledge graph. Grouping knowledge graphs together can form a knowledge graph database.
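A minimal sketch can show what an ontology adds on top of plain nodes and edges. It assumes a tiny, hypothetical class hierarchy (`Corn`, `Cereal`, `Crop` and so on) and two hypothetical entities. The function names are illustrative and are not part of any graph database's API.

```python
# Ontology layer (hypothetical): each class points to its parent class.
subclass_of = {
    "Corn": "Cereal",
    "Cereal": "Crop",
    "Soybean": "Legume",
    "Legume": "Crop",
}

# Instance data (hypothetical): what the graph database stores directly.
instance_of = {
    "field_7_planting": "Corn",
    "field_9_planting": "Soybean",
}


def ancestors(cls):
    """Walk up the class hierarchy and return every broader class."""
    found = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        found.append(cls)
    return found


def is_a(entity, cls):
    """True if the entity belongs to cls directly or through the hierarchy."""
    direct = instance_of.get(entity)
    if direct is None:
        return False
    return direct == cls or cls in ancestors(direct)


print(ancestors("Corn"))                   # ['Cereal', 'Crop']
print(is_a("field_7_planting", "Crop"))    # True
print(is_a("field_7_planting", "Legume"))  # False
```

The stored data never states that `field_7_planting` is a crop. That fact is inferred from the ontology, which is the kind of reasoning a graph database without an ontology leaves to the application.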
Deriving knowledge in real-world scenarios
Cropin, which offers an agricultural management platform, has worked with knowledge graph databases to improve its AI workflows. While most of the information is statistical or textual, said Praveen Pankajakshan, vice president of data science and AI, the company is increasingly exploring ways to derive knowledge from images and scenes. Cropin must manage these data sources to train better AI algorithms.
Pankajakshan's team is working on a crop knowledge graph that can automatically transform raw imagery into organized knowledge of more than 500 crops and 10,000 crop varieties. This process transforms information buried in data into a linked format and stores it in a knowledge graph in a machine-ready format. Tools and platforms can ingest the data and provide insights utilizing information on geographies, climate conditions and soil types, cultivation lifecycle and other factors.
For example, subtle color changes mean something different with corn plants versus soybeans. With the knowledge graph, Cropin can feed the meaning of these changes into various AI algorithms. As a result, the company can advise farmers on optimal watering, fertilizer and pest control interventions.
The company can also combine the information in the knowledge graphs with real-time data to help farmers understand problems and make better decisions about cultivation practices and land management.
Advantages of a knowledge graph
Graph technology is invaluable for storing and visualizing data with complex relationship structures, Stühler said. Knowledge graphs make it easier to factor in new data points than traditional databases do. For example, his team is working on applications to map risks across a supply chain spanning multiple countries. Data tables are not practical for such use cases, whereas graphs enable advanced analytics or machine learning.
Graph technology helps organize the data and connections, so they are ready to use. No extra work is required to compute or map anything when the user needs to pull in data. To look at supply chain risk, a risk node is added that relates to a specific city node in the knowledge graph. In contrast, tables generally make more sense for static data that is not complex or does not have relationships with other data points.
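The supply chain example can be sketched as follows. This is an assumption-laden illustration, not Scoutbee's implementation: the supplier, city and risk names, the `located_in` and `affects` relations, and the helper functions are all hypothetical.

```python
from collections import defaultdict

# Adjacency list: node -> list of (relation, target) edges.
edges = defaultdict(list)


def add_edge(source, relation, target):
    """Record one directed, labeled relationship between two nodes."""
    edges[source].append((relation, target))


# Existing graph (hypothetical data).
add_edge("Supplier A", "located_in", "City X")
add_edge("Supplier B", "located_in", "City Y")
add_edge("Supplier C", "located_in", "City X")

# A new data point arrives: a risk node that relates to a city node.
# Nothing about the existing nodes or edges has to change.
add_edge("Flood warning", "affects", "City X")


def suppliers_exposed_to(risk):
    """Find suppliers located in any city the given risk node affects."""
    affected = {t for rel, t in edges.get(risk, []) if rel == "affects"}
    return sorted(
        node
        for node, outgoing in edges.items()
        for rel, target in outgoing
        if rel == "located_in" and target in affected
    )


print(suppliers_exposed_to("Flood warning"))  # ['Supplier A', 'Supplier C']
```

Adding the risk took one new edge. In a table-based design, the same change would typically mean a new table or column plus a revised join.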
Knowledge graphs can also connect data points about internal customers, suppliers and third parties. Data scientists can then run algorithms to analyze relationships and draw conclusions.
Large language models (LLMs), which understand and summarize content, while also creating and predicting new content, add immense value, Stühler said. LLM front ends improve interaction, while the knowledge graphs enable semantic search of the data based on the interaction with other LLMs.
Use cases for knowledge graphs
Data teams can use several indicators to assess when knowledge graphs are the best fit, said Ryan Oattes, cofounder and CTO at Kobai, a decision intelligence platform. Knowledge graphs are best suited for storing and visualizing complex, interrelated data that can be difficult to represent in traditional databases, according to Coveo's Montagne.
Examples of information that is a good fit for knowledge graphs include the following:
- Biomedical data models of complex interactions among genes, proteins and diseases, allowing researchers to identify potential drug targets and develop new treatments.
- Financial data, such as stock prices, market trends and investment portfolios, to analyze market trends and make investment decisions based on a wide range of data sources.
- Social network data, such as user profiles, connections and interests, to personalize content and recommendations based on user interests and connections.
- Product data, such as features, specifications and reviews, to manage product development and ensure consistency across multiple channels and platforms.
- A high degree of interconnectedness between information, such as the complex relationship between maintenance work orders, machinery in a production line or aircraft and the spare parts needed to facilitate the work.
- Organized hierarchies of information to track the performance of parts, systems or manufacturing processes.
How to implement the technology
Knowledge graphs require new workflows to get the best results. Domain experts can help get the effort started.
"At its best, a knowledge graph contains terminology and structure that reflect people's understanding of a given domain, not something derived from data stores where data may originate," Oattes said. This allows maximum collaboration and reuse, two of the biggest opportunities to get value from a knowledge graph.
The schema must describe the ecosystem to ensure it's a good reflection of reality.
"Knowledge graphs live and die by the strength of their ontology," Stühler said.
Also, consider how LLMs can help build the ontology. LLMs can help organizations understand how schemas and topics are structured and describe the ecosystem in a meaningful way.
LLMs can also help catch duplicate nodes that can occur in a graph database. These models can manage, structure and improve knowledge graphs.
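To make the duplicate-node problem concrete, the sketch below flags node labels that likely refer to the same entity. It does not call an LLM. As a stand-in, it uses plain string similarity from Python's standard library (`difflib.SequenceMatcher`), and the node names, the `0.9` threshold and the function names are assumptions chosen for illustration. An LLM or an embedding model could replace the `similarity` function to catch duplicates that differ in wording rather than spelling.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(label):
    """Lowercase the label and drop punctuation so trivial variants match."""
    return re.sub(r"[^a-z0-9 ]", "", label.lower()).strip()


def similarity(a, b):
    """String similarity between 0 and 1; a stand-in for an LLM comparison."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def find_duplicates(labels, threshold=0.9):
    """Return pairs of labels that are similar enough to review as duplicates."""
    return [
        (a, b)
        for a, b in combinations(labels, 2)
        if similarity(a, b) >= threshold
    ]


# Hypothetical node labels from a graph database.
nodes = ["Acme Metals GmbH", "Borealis Parts", "acme metals gmbh."]
print(find_duplicates(nodes))  # [('Acme Metals GmbH', 'acme metals gmbh.')]
```

Flagged pairs are candidates for human review, not automatic merges, because similar labels can still refer to different real-world entities.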
"LLMs will eventually replace the way we store and interact with data. But from an aggregation, reflection and description perspective, knowledge graphs are here to stay," Stühler said.