
Predictive IT analytics improves distributed application monitoring

Data analysis on app performance isn't an instantaneous process, but it's worth the wait given the time it will save IT admins in problem-solving in the future.

How do you comprehensively track an application in one place, when the app is anything but?

Organizations run applications on multiple platforms -- on premises, in one or more clouds or all of the above -- and use highly distributed architectures, deploying code as microservices and in containers. Compared to monolithic apps, distributed applications decrease monitoring visibility, as well as the effectiveness and accuracy of analytics. Organizations must choose the right tools and set up a distributed application monitoring process.

Concurrently with the trend toward distributed applications and computing, more predictive IT analytics tools have cropped up to combat the problems of reactive IT. This predictive technology evaluates current stats, trends and historical data and then uses techniques, such as machine learning and data mining, to make suggestions about future or unknown events. While not foolproof, predictive IT analytics that prevent even a few incidents over an application lifecycle can yield huge cost savings. Predictive monitoring and analysis don't solve every problem, however, because IT organizations still must decide what they monitor and when.
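As a rough illustration of how historical data can turn into a suggestion about a future event, the sketch below fits a straight line to past readings of a metric and estimates how many more samples remain before it crosses a limit. It is a minimal example, not how any particular analytics product works: the function names, the sample disk-usage figures and the 90% threshold are all assumptions made for illustration, and real tools use far richer models than a linear trend.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of values against their sample index.

    Returns (slope, intercept). Hypothetical helper for illustration only.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def samples_until_threshold(values: Sequence[float], threshold: float) -> Optional[float]:
    """Estimate how many sampling intervals remain before the trend hits threshold.

    Returns None when the metric is flat or falling, because a straight-line
    trend then never reaches the threshold.
    """
    slope, intercept = fit_linear_trend(values)
    if slope <= 0:
        return None
    crossing_index = (threshold - intercept) / slope
    # Measure from the most recent sample; clamp at 0 if already past the limit.
    return max(0.0, crossing_index - (len(values) - 1))


# Assumed data: daily disk usage in percent for one hypothetical volume.
disk_usage_percent = [50, 52, 54, 56, 58]
days_left = samples_until_threshold(disk_usage_percent, threshold=90)
print(f"Estimated days until 90% full: {days_left}")
```

Even a toy model like this shows why history matters: with only two or three samples, a single noisy reading can swing the estimate by days.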

Distributed problems

Modern applications have shifted away from the monolithic, single-server install design to a distributed approach with several moving pieces that reside on even more moving pieces of IT infrastructure. Your organization's approach to overall monitoring can't have a single focus; it must cast a wide net over resources, such as external storage, networking and compute power.

Predictive analytics on a distributed application gets more complicated still. To figure out what application and infrastructure aspects to monitor, start with the top and bottom of the stack, and connect the pieces in the middle. The top end is about the customer experience with the application.

The client-side aspect of the application is critical to the overall performance assessment but is difficult to apply predictive analytics to. Application performance issues -- such as features that often work but then don't -- can occur seemingly at random and are, therefore, difficult to predict. Nevertheless, customer experience is the overall gauge with which to validate analytics information.

The value of predictive analytics systems diminishes if the information can't be used across multiple systems to form a complete picture. Without that picture, it's challenging to determine the impact of one system's issues on the entire application stack. The days of the siloed application are over, overtaken by interconnected pieces. For operations, this creates a jigsaw puzzle with plenty of missing pieces.
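One way to picture the impact of a single system's issues on the whole stack is as a walk over a dependency map. The sketch below assumes you can describe which component depends on which; the component names and the `depends_on` structure are hypothetical, and in practice this map is often the hardest piece of the puzzle to assemble.

```python
from collections import deque
from typing import Dict, List, Set


def impacted_components(depends_on: Dict[str, List[str]], failed: str) -> Set[str]:
    """Return every component that directly or indirectly depends on `failed`."""
    # Invert the edges so we can ask "who depends on this component?"
    dependents: Dict[str, List[str]] = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(component)

    impacted: Set[str] = set()
    queue = deque([failed])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in impacted:
                impacted.add(parent)
                queue.append(parent)

    # A dependency cycle could lead back to the failed component itself.
    impacted.discard(failed)
    return impacted


# Hypothetical stack: each key lists the components it relies on.
STACK = {
    "web-frontend": ["checkout-api", "catalog-api"],
    "checkout-api": ["payments-db", "message-queue"],
    "catalog-api": ["catalog-db"],
    "payments-db": ["shared-storage"],
    "catalog-db": ["shared-storage"],
}

print(sorted(impacted_components(STACK, "shared-storage")))
```

In this made-up stack, a storage problem at the bottom reaches every layer above it, while a message queue problem touches only the checkout path and the front end.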

[Figure: Distributed computing fallacies. Developers creating applications on distributed computing nodes should avoid these eight fallacies.]

What to track

It would take a small army of IT operations personnel wielding multiple IT analytics tools across cloud and on-premises systems to monitor distributed applications without any gaps in coverage. Unless you have an unlimited budget for monitoring, it isn't feasible. The key to predictive tools' success is the method used to gather, share and use data, more so than raw machine learning capability or trend recognition.

Correlate the predictive IT analytics to customer experience to ensure that the setup reveals the information IT needs to act upon. With customer experience guiding analytics, the organization's operations team can either prevent or minimize the impact of bugs and failures on customers or, at least, understand what the impact will be and provide workarounds.
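To make the idea of correlating analytics to customer experience concrete, the sketch below ranks infrastructure metrics by how strongly each one moves with a customer-facing measurement. Everything specific here is an assumption for illustration: the metric names, the sample values and the choice of Pearson correlation, which captures only linear relationships and says nothing about cause.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, at least 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


def rank_by_customer_impact(
    metrics: Dict[str, Sequence[float]],
    customer_metric: Sequence[float],
) -> List[Tuple[str, float]]:
    """Sort metrics by the strength (absolute value) of their correlation."""
    scored = [(name, pearson(series, customer_metric)) for name, series in metrics.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Assumed samples taken at the same five points in time.
page_load_ms = [100, 200, 300, 400, 500]  # customer-facing measurement
infrastructure = {
    "cpu_percent": [40, 42, 39, 41, 40],
    "queue_depth": [1, 2, 3, 4, 5],
}

for name, score in rank_by_customer_impact(infrastructure, page_load_ms):
    print(f"{name}: {score:+.2f}")
```

Metrics that rank near the top are candidates for closer monitoring; those near the bottom may be collecting data that tells the team little about what customers actually experience.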

Predictive IT analytics limitations

The goal of predictive analytics for distributed application monitoring is to detect and prevent issues, but not every failure or incident is preventable, and the analysis cannot occur in real time.

A significant and reasonable concern is the turnaround time of predictive IT analytics. Machine learning and data mining do not go hand in hand with real-time reporting. Both management and engineers must understand that predictive IT analytics systems require time to build sufficient data to process and analyze before the investment pays off with useful insights. That time can vary from hours to days, depending on data volume. Admins can reduce the data set to shorten the wait, but this endangers the accuracy of the analysis. The goal for predictive IT analytics is issue prevention for better customer experience and IT resource management. Organizations can rely on other methods for immediate alerting and incident response.

Predictive IT analytics can never address every potential random incident that can affect the application stack. Large-scale power outages, cloud vendors going offline and massive hardware failures are events that don't work into average expectations. But the more data your organization has, the better the outcomes will be. The key to success is to understand the technology's limits and benefits and to work within those parameters.
