Splunk preps OpenLLMetry tie-ins for deeper AI monitoring
Detailed visibility into AI agents' internal communications will be essential to enterprise trust in them, and it's something Splunk and the OpenTelemetry project intend to offer.
BOSTON -- Updates to Splunk Observability Cloud's OpenTelemetry underpinnings scheduled for early next year will give enterprise IT pros a closer look at AI apps' inner workings.
While Splunk's upcoming feature updates also include AI agents that automate troubleshooting tasks, it's more crucial for some large companies to peek behind the scenes at the agents themselves first. Specifically, IT organizations responsible for distinguishing AI agent hype from reality will need to closely monitor interactions among agents, large language models (LLMs), Model Context Protocol (MCP) servers, data repositories and human users.
"What context are agents sharing with each other when you don't have a human in the loop?" said Jonathan Moore, domain architect at a Fortune 100 company in the Midwest, during an interview at Splunk's .conf25 this week. "As commercial products evolve to be AI-centric, what are vendors doing to develop the AI agent user journey? How are agents communicating and passing responsible AI policies and security controls [to each other]?"
Moore said he has been following OpenLLMetry, an offshoot of the Cloud Native Computing Foundation's (CNCF) OpenTelemetry project that's developing AI monitoring tools for that purpose. OpenTelemetry established a working group earlier this year to devise standard naming conventions for AI monitoring, and OpenLLMetry is already present in some observability vendors' tools, including those from Dynatrace and Datadog.
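For context, the kind of span data this instrumentation produces can be sketched with the standard OpenTelemetry Python API. The following is a minimal illustration, not Splunk's implementation: the gen_ai.* attribute names follow OpenTelemetry's evolving GenAI semantic conventions, while the tracer name, "example-model" and the fake_llm_call helper are illustrative stand-ins.

    # Minimal sketch of OpenLLMetry-style instrumentation using the standard
    # OpenTelemetry Python API. The gen_ai.* attributes follow the project's
    # evolving GenAI semantic conventions; "example-model" and fake_llm_call
    # are illustrative stand-ins, not a real client or a vendor's implementation.
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

    # Export spans to the console so the example is self-contained and runnable.
    provider = TracerProvider()
    provider.add_span_processor(SimpleSpanProcessor(ConsoleSpanExporter()))
    trace.set_tracer_provider(provider)
    tracer = trace.get_tracer("demo.agent")

    def fake_llm_call(prompt: str) -> str:
        # Stand-in for a real LLM or agent framework call.
        return f"echo: {prompt}"

    def ask_model(prompt: str) -> str:
        # Each model call becomes a span whose attributes a backend, such as an
        # APM tool, can aggregate, trace across agents and evaluate.
        with tracer.start_as_current_span("chat example-model") as span:
            span.set_attribute("gen_ai.operation.name", "chat")
            span.set_attribute("gen_ai.request.model", "example-model")
            reply = fake_llm_call(prompt)
            span.set_attribute("gen_ai.usage.output_tokens", len(reply.split()))
            return reply

    if __name__ == "__main__":
        print(ask_model("What context are agents sharing?"))

In practice, spans like these are what let a backend reconstruct which agent called which model or MCP server, and with what context.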
Splunk's OpenLLMetry roadmap
Splunk is a top contributor to and maintainer of OpenTelemetry, and parent company Cisco has also integrated the observability standard into its AppDynamics tool, which is now part of Splunk Observability Cloud. According to Morgan McLean, one of the co-founders of OpenTelemetry and director of product management for Splunk's Observability Cloud, some 40 full-time engineers at Splunk are dedicated to OpenTelemetry.
"OpenLLMetry was a side project, and it's getting merged into OpenTelemetry," McLean said in an interview this week. "It's monitoring all of these, typically, LLMs, but theoretically any type of AI [entities] that are working together, as well as the data sources, whether those are MCP servers, API calls, storage resources or anything else that they're calling into."
Greg Leffler, director of developer evangelism at Splunk, said during a joint interview with McLean that this data, along with evaluation mechanisms for assessing the quality of those communications, will be incorporated into the next release of Splunk Observability Cloud. According to company officials at the conference, most of Splunk's product updates previewed this week are expected to become available in February.
"This is all really just an extension of our existing APM [application performance monitoring] product, with some specialized user experiences that are being developed," he said. "It also applies to things that will be revealed in the existing topology map and other places you'd expect them to be."
Those places will include Splunk's digital experience monitoring product, so users can see human interactions with AI agents as well, McLean said.
Splunk .conf25 attendees heard about Splunk's plans for AI agent monitoring and observability during a keynote presentation this week.
OpenTelemetry takes aim at instrumentation toil
Moore said AI agents show long-term promise to eliminate some of the onerous tasks associated with being on call, such as 2 a.m. wake-up calls for a system hiccup that could be automatically resolved. But there are less complex uses for AI in observability that would be more immediately productive, such as ensuring that application code is properly instrumented.
"I would love to have an agent that could go through my codebase and understand where I've missed instrumenting it," he said. "When it goes through a pipeline, it would be nice if there is a step in it that calls an OpenTelemetry agent for me and then evaluates my code and makes sure that I'm following whatever schema I need."
McLean confirmed that such a feature is on the Splunk Observability Cloud roadmap and will be available for the larger OpenTelemetry community, though he did not specify when it might be released. Discussions within the community about how to properly use AI for instrumentation have been ongoing this year, with some debate about the best implementation approach.
"That's not just from Splunk," McLean said. "In the OpenTelemetry community, we've been doing a lot of work where we're overhauling our documentation and our code so that all of the popular LLMs can be really easily trained on OpenTelemetry."
Beth Pariseau, a senior news writer for Informa TechTarget, is an award-winning veteran of IT journalism covering DevOps. Have a tip? Email her or reach out @PariseauTT.