Improve observability with AI: 5 real-world success stories

As businesses rely more on hybrid and multi-cloud, comprehensive visibility into system performance and its effect on business outcomes is critical. Observability and AI can help.

Enterprises are dependent on e-commerce and support, whether they are the consumer or provider. Their management systems tend to be bottom-up in their approach to user experience; they manage the state of a technology and not its contribution to the business. However, the real business imperative is quality of experience (QoE) rather than quality of service.

Across the cloud and the data center, an application depends on several separate components. The elements behind a given UX are usually shared with the load from other users, which can affect customer satisfaction, sales and overall functionality. Gathering data on all of these elements, then distilling conclusions about the overall QoE, can become impossible.

Traditional management is increasingly unable to respond. However, observability can address this issue, and AI can boost insights into QoE pitfalls. Read about these real-world AI-powered observability use cases and what makes modern observability tools effective.

What is the role of AI in observability?

Hybrid-cloud application deployments and the many elements involved in QoE can be complex. Traditional tools cannot communicate how infrastructure and application conditions can affect QoE remediation efforts. Some experts suggest creating a "digital twin" of the full network and IT environment to organize observability data, but the more popular strategy is to use AI.

Administrators can train AI, from machine learning to large language models, to correlate the state of business infrastructure with QoE. It can identify QoE problems across users and applications, and validate how possible remedies for abnormal conditions affect QoE.
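As a rough illustration of spotting an abnormal condition, the sketch below flags samples that deviate sharply from a recent baseline. It is a simple statistical stand-in, not a trained model, and the function name `flag_abnormal`, the window size, the threshold and the sample data are all assumptions made for this example.

```python
from statistics import mean, stdev

def flag_abnormal(samples, window=5, threshold=3.0):
    """Return indices of samples that deviate from the mean of the
    preceding `window` samples by more than `threshold` standard deviations."""
    flagged = []
    for i in range(window, len(samples)):
        baseline = samples[i - window:i]
        spread = stdev(baseline)
        if spread == 0:
            continue  # a flat baseline gives no scale to measure deviation against
        if abs(samples[i] - mean(baseline)) / spread > threshold:
            flagged.append(i)
    return flagged

# Hypothetical response times in milliseconds; the spike sits at index 6.
response_ms = [100, 102, 98, 101, 99, 100, 180, 103]
abnormal = flag_abnormal(response_ms)
print(abnormal)
```

A production tool would likely learn baselines per application and per time of day; the fixed window here only keeps the idea visible.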

Observability tool providers, with and without AI augmentation, report a 60%-90% reduction in the operations resources applied to UX problem-solving, which can affect sales by 40%-70%. Of the 88 companies that shared their observability experiences with Andover Intel, those that used no AI all fell in the lower third of each of these ranges. But the 13 companies that augmented observability with AI reported results at the top of these ranges.
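To make those figures concrete, the short sketch below works through the arithmetic. It assumes each range is split evenly into thirds, which the survey summary does not state explicitly, and the helper name `lower_third` is illustrative.

```python
def lower_third(low, high):
    """Return (start, end) of the bottom third of a range, assuming an even split."""
    width = (high - low) / 3
    return low, low + width

ops_low, ops_high = lower_third(60, 90)      # operations-resource reduction, percent
sales_low, sales_high = lower_third(40, 70)  # effect on sales, percent
ai_share = 13 / 88 * 100                     # share of respondents using AI

print(f"Operations reduction without AI: {ops_low:.0f}%-{ops_high:.0f}%")
print(f"Sales effect without AI: {sales_low:.0f}%-{sales_high:.0f}%")
print(f"Respondents using AI: {ai_share:.1f}%")
```

Under that assumption, the non-AI group sits at roughly 60%-70% and 40%-50%, and AI users make up just under 15% of the sample.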

Real-world AI-powered observability examples

All 88 companies told Andover Intel that they had improved on the baseline of traditional network and infrastructure management systems, as the following examples show.

  • Restored QoE. One mass-market retail operation used AI-based observability to reduce the time and human effort required to restore QoE by 84% and reduce the number of incidents where unsatisfactory QoE had to be remediated by over 50%. This is because negative trends could be spotted and addressed before users were impacted.
  • Reduced service outages. An electric utility reduced the number of customer hours of service outages by 63% with AI observability, compared to only 31% without AI.
  • Improved time-to-correct. A healthcare chain with over 500 locations improved time-to-correct on application access problems from 44 minutes to 26 minutes with observability improvements. The chain is exploring adding AI, which appears likely to reduce the time-to-correct to 17 minutes.
  • Lowered dropped call rates. A B2B product company experienced an 11% dropped-call rate on customer support because of QoE issues. With AI observability, they reduced this to 4%.
  • Isolated service outages faster. A public utility was experiencing service outages that took, on average, almost half an hour to fully isolate. Observability enabled them to cut that time to less than 20 minutes, and they're hoping adding AI analysis can do better, even making it possible to spot an oncoming problem before it hits.
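Several of these examples give before-and-after values rather than percentages. The sketch below converts them to relative reductions so they can be compared with the percentage figures; the helper name `relative_reduction` is an illustrative choice, and the 17-minute value is the healthcare chain's projection, not a measured result.

```python
def relative_reduction(before, after):
    """Percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100

time_to_correct = relative_reduction(44, 26)     # healthcare chain, observability only
time_to_correct_ai = relative_reduction(44, 17)  # healthcare chain, projected with AI
dropped_calls = relative_reduction(11, 4)        # B2B company, relative drop in call loss

print(f"Time-to-correct: {time_to_correct:.1f}% shorter")
print(f"Time-to-correct with AI (projected): {time_to_correct_ai:.1f}% shorter")
print(f"Dropped-call rate: {dropped_calls:.1f}% lower")
```

The public utility's figures ("almost half an hour" to "less than 20 minutes") are too approximate to convert the same way.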

Are all companies using AI in observability?

Despite these successes, only a minority of enterprises use observability, with or without AI augmentation. Hybrid-cloud users in verticals that support online sales and support make up a slight majority of adopters; in those verticals, a third of enterprises use conventional observability and 10% use AI-powered observability.

However, nearly half of all related enterprises plan to review AI observability in their next budget cycle, showing that the application of AI within observability is becoming increasingly valuable. The future of AI in observability is still murky, but it could be bright.

What makes modern observability tools effective?

Observability enables stakeholders to completely understand the internal state of the infrastructure that supports an application or service. It relays the actual state of UX and restores proper service as needed.

Many observability tools simply correlate device management information within a single management system and device type, such as networking or servers. While this offers some benefit compared to a pure device-centric view, a "full-stack" offering might replace traditional observability practice. While the former approach requires enhancements to a management system, full-stack observability will require a higher-level management integrator to provide a holistic view of user experience and application usability.

If observability tools are to succeed where individual management tools have not, then they must provide for three things:

  1. Complete understanding. Tools must offer high visibility into what could affect application QoE. Any missing information reduces a tool's benefits.
  2. Relating state to QoE. Broad visibility can be overwhelming if the data can't be summarized and correlated. Data that is relevant to a given application experience needs to stand out and become actionable. Tools must have the capability to digest and correlate separate sources of information. This is the big difference between observability and management.
  3. Restoration. Tools must quickly and easily convert issues into remedial action. This is just as important as relating operating state to QoE, but it challenges even observability tools and is responsible for the growing interest in AI.
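The second requirement, relating state to QoE, can be sketched as ranking infrastructure signals by how strongly each one tracks a QoE score. This is a minimal illustration under stated assumptions, not how any particular tool works: the signal names, the sample values, the 0-100 QoE score and the function names `pearson` and `rank_signals` are all hypothetical, and plain Pearson correlation stands in for whatever analysis a real product applies.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / sqrt(var_x * var_y)

def rank_signals(signals, qoe):
    """Sort signals by absolute correlation with QoE, strongest first."""
    scores = {name: pearson(values, qoe) for name, values in signals.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)

# Hypothetical per-minute samples from separate management sources.
signals = {
    "db_latency_ms": [20, 22, 25, 40, 80, 120],
    "cpu_percent": [50, 52, 49, 51, 50, 53],
    "link_errors": [0, 0, 1, 0, 1, 0],
}
qoe_score = [95, 94, 93, 85, 60, 40]  # hypothetical 0-100 experience score

ranked = rank_signals(signals, qoe_score)
for name, score in ranked:
    print(f"{name}: {score:+.2f}")
```

In this made-up data, database latency stands out as the signal most closely tied to the falling QoE score, which is the kind of summary that makes broad visibility actionable. Correlation alone does not establish cause, so a ranking like this would be a starting point for restoration rather than a diagnosis.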

Tom Nolle is founder and principal analyst at Andover Intel, a consulting and analysis firm that looks at evolving technologies and applications first from the perspective of the buyer and the buyer's needs. By background, Nolle is a programmer, software architect, and manager of software and network products. He has provided consulting services and technology analysis for decades.
