If you want to understand relationships, graph analytics is the way to go. While graph databases and graph theory aren't new, graph analytics is finally ready for prime time.
"We needed a truly elastic compute environment in order for graph [analytics] to really work," said Mark Beyer, distinguished vice president analyst at Gartner. "And we needed an elastic compute environment to figure out what meaningful boundaries were needed in a graph."
Before cloud computing was available, it was difficult outside of laboratory environments to determine how small or large a graph should be. Now that elastic compute is readily available, using graph analytics for big data has become more popular.
What is graph analytics?
Graph analytics uses algorithms to explore the relationships among entries in a graph database, including connections among different people, transactions or organizations. Use cases include contact tracing, cybersecurity, drug interaction, recommendation engines, social networks and supply chains.
Data scientist and mathematician Adrian Zidaritz said that from a competitive standpoint, businesses can't afford to choose only a subset of the many relationships among various data points and squeeze them into relational tables.
"Doing analytics on this graph data requires an adaptation of current deep learning algorithms to take advantage of the graph structure instead of the flat geometry of the relational tables," Zidaritz said. "The term geometric deep learning -- a rarity a few years back -- is now seeing increased usage."
Graphs can be transformed into vectors and analyzed with linear techniques, much as text is in text analysis.
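As a rough illustration of turning graph nodes into vectors, the sketch below describes each node by the set of nodes it connects to, then compares two nodes with cosine similarity. The toy edge list, the node names and the function names are all hypothetical. Production graph embeddings are usually learned rather than built by hand like this.

```python
from math import sqrt

# Hypothetical toy graph: undirected connections between made-up nodes.
EDGES = [
    ("ana", "bo"), ("ana", "cy"), ("ana", "di"),
    ("eli", "bo"), ("eli", "cy"), ("eli", "di"),
    ("fay", "gus"), ("fay", "di"),
]


def build_neighbors(edges):
    """Map each node to the set of nodes it is directly connected to."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors


def context_vector(node, neighbors, vocabulary):
    """Represent a node as a 0/1 vector over all nodes: 1 means 'is a neighbor'."""
    return [1 if other in neighbors[node] else 0 for other in vocabulary]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


NEIGHBORS = build_neighbors(EDGES)
VOCABULARY = sorted(NEIGHBORS)


def similarity(a, b):
    """Compare two nodes by the neighbors they have in common."""
    return cosine(
        context_vector(a, NEIGHBORS, VOCABULARY),
        context_vector(b, NEIGHBORS, VOCABULARY),
    )


print(similarity("ana", "eli"))  # identical neighbors, so close to 1
print(similarity("ana", "fay"))  # only one shared neighbor, so lower
```

In this toy data, "ana" and "eli" share every neighbor and score near 1, while "ana" and "fay" share only one and score lower.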
"Both words in a text and nodes in a graph are powerfully defined by their context," Zidaritz said. "[As J.R. Firth said,] 'You know a word by the company it keeps.' Similarly, you know a node by the company it keeps."
Graph analytics in action
Ramesh Hariharan, CTO and head of data services at LatentView Analytics, said using graph analytics for big data enables faster decision-making, including automated decisions.
"Recommendation engines are a classic application of graph analytics," Hariharan said. "The other thing is product trend forecasting. [Consumers] talk a lot about trends [such as] health and relationship trends. Companies would like to know which of these trends are important versus which are fads."
An obvious use case is identifying social media influencers and the messages that are going viral. In fact, graph analytics can analyze all kinds of networks.
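A minimal sketch of the influencer idea, assuming a hypothetical list of (follower, followed) pairs: counting incoming edges per account gives a crude influence ranking. Real analyses typically use richer centrality measures. The account names and function names here are invented for illustration.

```python
from collections import Counter

# Hypothetical follow edges: (follower, followed).
FOLLOWS = [
    ("bo", "ana"), ("cy", "ana"), ("di", "ana"), ("eli", "ana"),
    ("ana", "bo"), ("cy", "bo"),
    ("di", "cy"),
]


def follower_counts(follows):
    """Count incoming edges (followers) for every followed account."""
    return Counter(followed for _follower, followed in follows)


def top_influencers(follows, k=2):
    """Return the k accounts with the most followers, highest first."""
    return follower_counts(follows).most_common(k)


print(top_influencers(FOLLOWS))
```

With this toy data, "ana" ranks first with four followers and "bo" second with two.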
"If you have data in a relational database and you want to find out how many people are above me in an organization, it's very difficult to write an SQL query that can do that. It's much easier to do with graphs," Hariharan said. "Graphs help us easily discover relationships so we can try to understand [their] attributes."
Mike Chrzanowski, business intelligence expert at spreadsheet consultancy Senacea, said his team used graphs to determine what was causing faulty outputs. It turned out resource matching was the problem. By avoiding risky pairings, they were able to quickly fix the production process.
"The visual representation of the problem helped us to cut through the informational noise and focus on finding a lean and efficient solution," Chrzanowski said. "Graphs can help highlight the important relationships in vast data sets with complex relationships between elements and millions of existing causality pathways. It can be a stepping-stone on the journey from data to information that influences decision-making."
Working with graphs can sometimes be challenging, however. For example, some social media influencers have a disproportionate number of connections.
"Barack Obama's Twitter account has 127.6 million followers, which is the most on the platform," Hariharan said. "A graph analysis exploring his connections would be challenging to compute and, perhaps, even more difficult to analyze due to sheer volume."
Expect data management to evolve
According to Gartner, the future of data management started in 2019, and it's relevant to graph analytics. One Gartner presentation created in 2019 posed some interesting questions: What if we have enough data? And what if we have so much data that the data is trying to find itself?
"If I see correlations in data sets, and those correlations keep occurring over and over again, it's probably not a correlation anymore. There's probably some level of causality going on," Beyer said. "Once you have that much data, you can start to figure out all the triples of how data goes together, correlation, causal, truly enforced integrity -- all that stuff. You can really take advantage of an elastic environment."
Beyer also said a fundamental shift is happening from a paradigm of expectations to a paradigm of experiences.
"Traditional metadata and traditional analytics were designed to behave in a certain way," Beyer said. "We were always focused on forcing things into working according to expectations, identifying outliers and errors, forcing the errors to be corrected and sectioning off the outliers to be managed separately. When you think about the [COVID-19] pandemic and disease management, all of the assumptions about how we were going to do things were no longer true, so we had to move to an experience response model."
Gartner published a special report about how to use graph analytics for big data to solve pandemic problems. Disease management requires putting patients in cohorts, such as people who have underlying conditions or people who take certain medications. However, there are other considerations like supply chain disruptions, which could affect the availability of ventilators or pharmaceuticals.
"Now, we have to analyze the experience, and to analyze the experience, you need to analyze all of the connections," Beyer said.