As generative AI finds its way into more and more IT tools, early applications for IT ops have begun to emerge, including a potentially major step forward for AIOps.
Vendors that already used AI and machine learning to automate IT operations, otherwise known as AIOps, have embraced generative AI, which can produce new material such as text and code based on the analysis of existing data sets. A further subset of generative AI, conversational AI, can deliver the results of that analysis in human language via chatbots or virtual assistants. Both advances in AI have taken the tech industry by storm in the last year following the introduction of OpenAI's ChatGPT.
Now users of generative AI features in AIOps tools anticipate significant benefits as they put them into production. In the case of BigPanda's recently updated software, that means the ability to proactively generate root cause analysis for incident response. Elsewhere, Glean Technologies' Workplace Search has begun delivering search results on swaths of company data for developers without human intervention from a developer experience (devx) team, potentially yielding significant cost savings.
AIOps tools have weathered their own hype cycle and growing pains since their introduction into the mainstream in 2018. Their relative stability provides a strong foundation for experimenting with generative AI, said Alvin Smith, vice president of global infrastructure and operations at InterContinental Hotels Group (IHG), a multinational hospitality company headquartered in the U.K.
"One of the things that BigPanda does well is data enrichment. We've got all of these different systems providing us monitoring and alerting information," Smith said. "Being able to pull that in and having it further enriched within BigPanda is going to increase its accuracy [for generative AI]."
IHG builds on improved MTTR with BigPanda
IHG, which operates more than 6,000 hotels globally and employs more than 300,000 people, first brought in BigPanda's AIOps product in 2017 to reduce the number of alerts its IT staff was getting from multiple IT monitoring and observability tools. Those tools included Dynatrace, which was eventually replaced with AppDynamics and others. BigPanda's event correlation and alert reduction are also connected to IHG's ServiceNow ticketing system to start incident response workflows for its incident managed service provider.
The company still uses AppDynamics dashboards for its IT command center employees to get live views of IT infrastructure as they triage incidents. But BigPanda has reduced the number of tickets created and improved the company's overall mean time to resolve incidents. As a result, the company reached 99.8% reliability for its systems in 2022, Smith said.
"We're able to see [problems] faster; we're able to pull in teams faster. And that gives us more time to resolve [incidents]," he said. "We can use the event correlation and weed out the noise to focus on what's actually happening in our environment."
IHG is about to complete proof of concept testing on BigPanda's Generative AI for Automated Incident Analysis, released in July. The update uses a large language model to compare data about ongoing incidents with historical incident responses. It points to the potential root cause of an issue automatically within seconds or minutes, rather than leaving that determination to a post-incident review.
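The idea of checking an ongoing incident against past ones can be illustrated with something far simpler than a large language model: a frequency tally of confirmed root causes. The sketch below is not BigPanda's method. The alert signatures, root causes and function name are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical history of resolved incidents: (alert signature, confirmed root cause).
HISTORY = [
    ("booking-api-latency", "database connection pool exhausted"),
    ("booking-api-latency", "database connection pool exhausted"),
    ("booking-api-latency", "expired TLS certificate"),
    ("disk-usage-high", "log rotation disabled"),
]


def likely_root_causes(signature, history):
    """Return (root cause, share of past incidents) pairs for one alert
    signature, most frequent cause first. Returns [] if there is no history."""
    causes = Counter(cause for sig, cause in history if sig == signature)
    total = sum(causes.values())
    if total == 0:
        return []
    return [(cause, count / total) for cause, count in causes.most_common()]


if __name__ == "__main__":
    for cause, share in likely_root_causes("booking-api-latency", HISTORY):
        print(f"{share:.0%} of past incidents: {cause}")
```

A tally like this can only repeat what has been seen before under an identical signature. Matching incidents that are similar but not identical is the harder part of the problem.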
Root Cause Changes, another new feature, uses machine learning to correlate individual changes tracked in systems such as ServiceNow's Configuration Management Database and Atlassian's Jira with incident data and suggest which changes are most likely the root cause of incidents.
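A rough sense of what correlating changes with incidents involves can be had from a plain time-window heuristic, which swaps out the machine learning the feature is described as using. In the sketch below, the `Change` record, the 24-hour window and the function name are all assumptions made for illustration, not part of any vendor's API.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Change:
    """Hypothetical change record, as might be exported from a change-tracking system."""
    change_id: str
    service: str
    deployed_at: datetime


def suspect_changes(changes, incident_service, incident_start,
                    window=timedelta(hours=24)):
    """Return changes made to the affected service within `window` before
    the incident started, most recent first."""
    candidates = [
        c for c in changes
        if c.service == incident_service
        and timedelta(0) <= incident_start - c.deployed_at <= window
    ]
    return sorted(candidates, key=lambda c: c.deployed_at, reverse=True)


if __name__ == "__main__":
    start = datetime(2023, 8, 1, 12, 0)
    log = [
        Change("CHG-1", "booking", start - timedelta(hours=2)),
        Change("CHG-2", "payments", start - timedelta(hours=1)),
    ]
    print([c.change_id for c in suspect_changes(log, "booking", start)])
```

Recency on the same service is a crude signal. A learned model could also weigh factors such as the kind of change and how often similar changes preceded incidents in the past.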
"We're looking for generative AI and AIOps to say, 'OK, you've had this happen in the past, and eight times out of 10, here was your root cause,'" Smith said. "We're hoping to get to that path of recovery much faster."
IHG has already automated the remediation of some routine issues, such as file systems running out of disk space. It remains to be seen whether generative AI will finally lead to auto-remediation for more complex issues -- the ultimate goal of AIOps tools. But Smith said he's hopeful that it will.
"It can look in other areas outside of our typical monitoring. We can work with our application teams and look at some of the data that they're leveraging and add BigPanda's enrichment," including application logging and infrastructure logging tools from Sumo Logic and LogicMonitor, he said.
Most of these other vendors, including AppDynamics under Cisco's Full-Stack Observability suite and LogicMonitor through its 2021 acquisition of Dexda, claim to offer a similar centralized clearinghouse for AIOps data management. But BigPanda is the one that has proven itself in IHG's environment already with its emphasis on data quality, Smith said -- an issue that plagued early adopters of AIOps tools. BigPanda launched a Data Engineering service in February that automatically grooms event data for customers when they first feed it into the BigPanda AIOps back end.
"It's the trust factor," Smith said. "We brought BigPanda on specifically for correlation, and they do that really well. … Others are talking about it; they're all talking about generative AI around it. But you go with what you know and what you're comfortable with."
Cruise projects $1 million in savings with Glean chatbots
Glean Workplace Search, marketed by Glean Technologies Inc., isn't a typical AIOps tool. The company's main focus is on enterprise search that replaces clunkier corporate intranet websites, rather than IT automation. That and Glean's ability to serve up search results while preserving corporate access management policies were what first appealed to autonomous vehicle company Cruise when it bought the tool in 2022.
But new generative AI features, including chatbots, that were added by Glean in April will make a major difference in the efficiency of on-call support at Cruise, according to David Cooke, manager of corporate engineering at the company, which is based in San Francisco.
Cruise's devx team had previously developed a Slack chatbot to respond to basic questions from developers about company policies and documents. But a human from the devx team often had to respond to more complex questions.
"The Slackbot was developed to answer common questions and spit out the default answer that's provided for commonly requested tasks," Cooke said. "Glean Chat has turned that into an interactive chatbot, backed by the data Glean is indexing -- on our wiki, in GitHub repos, in Jira tickets, whatever it can find -- that has relevant information [and] spits back out a human-friendly answer."
Cooke estimated the Glean chatbot alone will save the devx team 115 hours per week in on-call support time.
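Set against the roughly $1 million in projected annual savings, the 115-hour weekly estimate implies a value per support hour that is easy to work out. The short calculation below assumes 52 working weeks a year and that the savings come entirely from those hours. Neither assumption is stated in the article.

```python
WEEKS_PER_YEAR = 52  # assumption: on-call support runs every week of the year


def hours_per_year(hours_per_week, weeks=WEEKS_PER_YEAR):
    """Annualize a weekly hours figure."""
    return hours_per_week * weeks


def implied_hourly_value(annual_savings_usd, hours_per_week, weeks=WEEKS_PER_YEAR):
    """Dollar value per saved hour implied by an annual savings figure."""
    return annual_savings_usd / hours_per_year(hours_per_week, weeks)


if __name__ == "__main__":
    saved = hours_per_year(115)                    # 115 hours/week from Cooke's estimate
    rate = implied_hourly_value(1_000_000, 115)    # about $1 million a year
    print(f"{saved} hours/year, about ${rate:.0f} per hour")
```

Under those assumptions, the estimate works out to 5,980 hours a year, or roughly $167 per hour of on-call time.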
"So just this use case was [worth] about a million dollars a year, from just this bot," he said.
As an enterprise search tool, Glean isn't purpose-built for IT ops or DevOps the way tools such as BigPanda, GitHub Copilot Chat and Atlassian's Confluence are, but that might make it a more versatile tool that can serve business stakeholders as well as the development team, Cooke said.
"I'm not necessarily just focused on DevOps but serving the entire enterprise," he said. "My team is looking to integrate a type of troubleshooting support layer on top of all our applications. … You would have this tool in the application you're interfacing with to walk you through the issue you're running into in the moment."
Cooke said he's looking forward to the development of APIs and other utilities on the Glean roadmap as part of the Glean Platform that will make that kind of application integration easier. One Glean Platform item, named the Tools API, would allow the Cruise devx team to use GPT-4 to take certain actions in response to events, such as opening a Jira ticket. The Tools API is still under development, while other components of the platform are in closed beta with a limited number of customers, according to a Glean spokesperson.
"[That would mean] a productivity boost where you won't have to context-switch between multiple applications," Cooke said. "You can just run it through this single interface to produce multiple actions in the connected platforms."
Beth Pariseau, senior news writer at TechTarget, is an award-winning veteran of IT journalism. She can be reached at [email protected] or on Twitter @PariseauTT.