graph database

A graph database, also referred to as a semantic database, is a software application designed to store, query and modify network graphs. A network graph is a visual construct that consists of nodes and edges. Each node represents an entity (such as a person) and each edge represents a connection or relationship between two nodes. 

Graph databases have been around in some variation for a long time. For example, a family tree can be thought of as a very simple graph database.

The concept of using databases to map relationships digitally began to see widespread business use around 2015, when increased compute power, in-memory computing and agreed-upon standards moved the concept from academia to real-world uses in business and enterprise computing.

Graph databases are well-suited for analyzing interconnections, which is why there has been considerable interest in using them to mine data from social media. Graph databases are also useful for working with data in business disciplines that involve complex relationships and dynamic schema, such as managing supply chains, identifying the source of an IP telephony issue and creating "customers who bought this also looked at..." recommendation engines.

The mathematics behind graph databases, known as graph theory, is often credited to 18th-century mathematician Leonhard Euler.

The structure of a graph database

Traditionally classified as a type of NoSQL database, graph databases are sometimes referred to as triple stores. That's because this type of database uses a special index that stores information about nodes, edges and the relationship between them in groups of three.

A triple, which may also be referred to as an assertion, has three main fields: a subject, a predicate and an object. Each subject, predicate or object is represented by a uniform resource identifier (URI).

How information is indexed

In a triple store, the first field in the database holds the URI for the subject, the second field holds the URI for the predicate and the third field holds the URI for the object. While there are a number of different strategies that graph databases may use for storing triples, most use an index that abbreviates the three primary fields to {?s, ?p, ?o}.
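As a rough illustration of the three-field layout, the following Python sketch models a triple and a pattern in which None plays the role of a variable such as ?s. The Triple type, the matches helper and the example.org namespace are assumptions made for this example; they are not taken from any particular graph database product.

```python
from typing import NamedTuple, Optional

# Hypothetical namespace used only for this illustration.
EX = "http://example.org/"


class Triple(NamedTuple):
    subject: str    # URI of the subject (?s)
    predicate: str  # URI of the predicate (?p)
    obj: str        # URI of the object (?o)


def matches(triple: Triple,
            s: Optional[str] = None,
            p: Optional[str] = None,
            o: Optional[str] = None) -> bool:
    """Return True if the triple fits the pattern; None acts as a variable."""
    return ((s is None or triple.subject == s)
            and (p is None or triple.predicate == p)
            and (o is None or triple.obj == o))


fact = Triple(EX + "Bob", EX + "marriedTo", EX + "Julie")

print(matches(fact, s=EX + "Bob"))       # pattern { :Bob ?p ?o }
print(matches(fact, p=EX + "worksFor"))  # pattern { ?s :worksFor ?o }
```

Real triple stores typically keep several sorted copies of this data so that any combination of known fields can be looked up quickly; the linear check here only shows the shape of a pattern.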

For example, if the visual construct for a graph is given as follows:

[Figure: Nodes and edges]

Then the index will look like this:

Row  ?s      ?p              ?o
1    :Bob    :marriedTo      :Julie
2    :Bob    :brotherOf      :Steve
3    :Bob    :listensTo      :RockMusic
4    :Julie  :listensTo      :RockMusic
5    :Julie  :sisterInLawTo  :Steve
6    :Jim    :worksFor       :IBM

How information in a graph database is queried

Each triple in a graph database only gets stored once in the index. As with relational databases, a straight lookup query in a graph database is a simple process.

  • If the query is for what information is known about Bob, the indexer only needs to search rows 1-3 of the database.
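The straight lookup can be sketched in Python as follows. This is a minimal illustration that assumes the six example rows are held in memory and grouped by subject; the ROWS list, the by_subject dictionary and the about function are hypothetical names, not part of any real indexer.

```python
from collections import defaultdict

# The six rows of the example index as (subject, predicate, object).
ROWS = [
    (":Bob", ":marriedTo", ":Julie"),
    (":Bob", ":brotherOf", ":Steve"),
    (":Bob", ":listensTo", ":RockMusic"),
    (":Julie", ":listensTo", ":RockMusic"),
    (":Julie", ":sisterInLawTo", ":Steve"),
    (":Jim", ":worksFor", ":IBM"),
]

# Group row numbers (1-based, as in the example index) by subject.
by_subject = defaultdict(list)
for row_number, (s, p, o) in enumerate(ROWS, start=1):
    by_subject[s].append(row_number)


def about(subject):
    """Return (row, predicate, object) for every row stored for one subject."""
    return [(n, ROWS[n - 1][1], ROWS[n - 1][2])
            for n in by_subject.get(subject, [])]


for row in about(":Bob"):
    print(row)
```

Because the rows are grouped by subject, the lookup for :Bob touches only rows 1-3 and never scans the rows for :Julie or :Jim.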

The real power and speed of a graph database come from indexing combinations of triples. Here are a few examples:

  • If the query is for who Bob is married to, the indexer will look for the predicate :marriedTo in rows 1-3 and then retrieve the matching object.  (Bob is married to Julie.) 
  • If the query is to identify everyone who listens to the same kind of music as Bob, the indexer will first ask { :Bob :listensTo ?o } and identify :RockMusic as the object. 

To finish the second query, the indexer then asks { ?s :listensTo :RockMusic }, which matches rows 3 and 4. The subject in row 3 is Bob himself, so the subject in row 4 is the other person who listens to rock music. (It turns out to be Julie, Bob's wife.)
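Both example queries can be sketched in Python by chaining triple patterns. The code below assumes the example index is a small in-memory list and uses a simple linear scan; the match and same_music_as functions are hypothetical helpers written for this illustration, and a production triple store would use its indexes rather than scanning every row.

```python
# The six rows of the example index as (subject, predicate, object).
ROWS = [
    (":Bob", ":marriedTo", ":Julie"),
    (":Bob", ":brotherOf", ":Steve"),
    (":Bob", ":listensTo", ":RockMusic"),
    (":Julie", ":listensTo", ":RockMusic"),
    (":Julie", ":sisterInLawTo", ":Steve"),
    (":Jim", ":worksFor", ":IBM"),
]


def match(s=None, p=None, o=None):
    """Return all rows that fit the pattern; None stands for a variable."""
    return [t for t in ROWS
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def same_music_as(person):
    """Find everyone who listens to the same kind of music as `person`."""
    # Step 1: { person :listensTo ?o } yields the kinds of music.
    genres = {o for _, _, o in match(s=person, p=":listensTo")}
    # Step 2: { ?s :listensTo genre } yields the listeners, minus the person.
    return sorted({s for g in genres
                   for s, _, _ in match(p=":listensTo", o=g)
                   if s != person})


# Query 1: { :Bob :marriedTo ?o }
print([o for _, _, o in match(s=":Bob", p=":marriedTo")])

# Query 2: who listens to the same kind of music as Bob?
print(same_music_as(":Bob"))
```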

Types of graph databases

Historically, graph databases have been divided into two categories: property graphs, which store nodes and edges along with properties attached to them, and knowledge graphs like the one above, which focus on the semantic aspects of data and store information in triples. Generally speaking, indexing strategies for both types are similar.

It is expected that over time, knowledge graphs and property graphs will merge and the architectural distinctions between these two types of graph databases will fade away.

Use cases for graph databases

Current use cases for graph databases include the following:

  • Allow data analysts to federate data sets without having to create and run complex queries that join combinations of tables together, as in the relational database model.
  • Help developers create the back end for voice assistants by mapping possible user questions to correct answers. 
  • Identify clusters of events that are connected in unusual ways to detect fraud.
  • Examine direct connections to identify potential indirect connections for recommendation engines. 
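The recommendation use case, in which direct connections reveal indirect ones, can be sketched in Python as follows. The customers, products and the :bought and :lookedAt predicates are invented for this illustration, and also_looked_at is a hypothetical helper rather than a feature of any real recommendation engine.

```python
from collections import Counter

# Hypothetical triples describing what customers bought and looked at.
TRIPLES = [
    (":Ann", ":bought", ":Tent"),
    (":Ann", ":lookedAt", ":SleepingBag"),
    (":Raj", ":bought", ":Tent"),
    (":Raj", ":lookedAt", ":SleepingBag"),
    (":Raj", ":lookedAt", ":Stove"),
    (":Mia", ":bought", ":Kayak"),
    (":Mia", ":lookedAt", ":Paddle"),
]


def also_looked_at(item):
    """Rank items viewed by customers who bought `item`."""
    # Direct connections: who bought the item?
    buyers = {s for s, p, o in TRIPLES if p == ":bought" and o == item}
    # Indirect connections: what else did those buyers look at?
    counts = Counter(o for s, p, o in TRIPLES
                     if p == ":lookedAt" and s in buyers and o != item)
    return counts.most_common()


print(also_looked_at(":Tent"))
```

The tent and the sleeping bag are never linked by a triple of their own; the connection is inferred by following two edges through the shared customers.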

Future of graph databases

Graph databases are expected to play a major role over the next decade in areas as diverse as machine learning, Bayesian analysis, data science and artificial intelligence, as well as in helping to manage enterprise data and data interchange.

One of the most significant impacts on this type of database will be improvements in data federation. When knowledge graphs can be easily federated, one database will be able to determine that it needs data it doesn’t have and automatically retrieve that data from another knowledge graph. With this ability, it is likely that federation will help developers create blockchains that use relevant metadata to authenticate transactions in banking, finance, voting and smart contracts.

See also:  social graph, graph search

This was last updated in March 2020
