5 critical features that put the 'AI' in AIOps tools

Don't fall victim to AI-washing in the IT systems management market. Instead, know what to look for in a truly 'intelligent' operations platform -- starting with these 5 capabilities.

There are many rules-based IT management systems that don't use artificial intelligence, but are marketed as such -- so what distinguishes true AIOps tools from the imitators?

An example can help illustrate the difference:

Many children are warned about fires, and told not to touch them to avoid being burned. Those who listen show an ability to follow rules. Those who listen, but decide to personally assess the consequences of touching fire -- and, after being burned, decide never to touch it again -- show a degree of learning.

But what about those who were never cautioned about fire, and then touch it and get burned? They might try one more time, and notice the heat intensifies as they move closer to the flames. They, in turn, observe that the heat becomes less intense as they move away.

These children are building up knowledge, and demonstrating intelligence -- which is how artificial intelligence (AI) must operate within technology to form a true AIOps tool.

AIOps prerequisites

To determine if a purported AIOps tool really has artificial intelligence at its core, look for these five components:

Adaptability to rules. At a base level, rules-based systems provide an efficient way to run apps and handle the majority of IT issues, as long as those issues are easily definable, identifiable and unchanging. On the other hand, some rules are made to be broken. If there are occurrences within an environment that do not stick to the rule, an AIOps tool learns why and then adapts to deal with them. For example, a newer network device might receive a firmware upgrade without any problems, while an older device might have insufficient memory to handle the upgrade. Rather than just raise an exception incident for the older device, an AIOps platform could reroute traffic from that device to another device that's underutilized and up to date. The platform might also place less sensitive traffic, or traffic that's confined to one part of the network, onto the older device, where it can reside until the device is retired.

True learning. An AIOps system is able to build up its knowledge from essentially nothing. Upon installation, the system must fully map all available network components, including all server and storage assets, along with end-user devices. This process must be dynamic: an AIOps tool continues to monitor the environment, updating its knowledge of the platform and adjusting its data-flow map as it goes. This process is driven by machine learning (ML), a necessary basis for AI functionality.

Association of patterns to known issues. Traditional rules-based systems already match observed patterns against known issues, for example to protect against malware and attacks by malicious individuals. The capability is not unique to AIOps, but it provides a critical foundation for AIOps tools.

Heuristics. This feature enables an AIOps platform to predict and manage issues that might discretely plague an IT environment. An efficient heuristics engine combines the pattern-matching capabilities listed above with predictive analytics to assign risk profiles to detected events. These risk profiles become triggers for AI events.

AI event engine. With a foundation of rules, machine learning and heuristics, an AIOps tool gives an ops admin insight into how different workloads interact and depend on one another. If the tool identifies an issue -- either present or forthcoming -- it takes steps, independent of human intervention, to ensure that the issue doesn't affect end users.

These steps might include the rollback of a microservice instance to a previously known, working version, or moving a microservice from one part of a platform to another based on resource availability. An AIOps tool might throttle data flows to a certain part of the network, as it continues to investigate the true root cause. Whatever purpose the AIOps tool serves, it must remove the need for human intervention wherever possible, while maintaining a platform's optimal operation and providing the highest availability levels.
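To make the chain from pattern matching to risk profile to automated action more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular AIOps product works: the metric names, thresholds, action labels and the `Event`, `match_pattern`, `risk_score` and `decide` names are all hypothetical, and a simple linear projection stands in for the predictive analytics a real heuristics engine would use.

```python
from dataclasses import dataclass

# Hypothetical known-issue patterns: (metric, threshold, issue label).
KNOWN_PATTERNS = [
    ("error_rate", 0.05, "bad_release"),
    ("link_utilization", 0.90, "network_saturation"),
]

# Hypothetical mapping from an issue label to an automated action.
ACTIONS = {
    "bad_release": "rollback_to_last_known_good",
    "network_saturation": "throttle_data_flow",
}


@dataclass
class Event:
    service: str
    metric: str
    value: float   # current reading of the metric
    trend: float   # change per minute; positive means getting worse


def match_pattern(event):
    """Return (issue label, threshold) for a known metric, or None."""
    for metric, threshold, label in KNOWN_PATTERNS:
        if event.metric == metric:
            return label, threshold
    return None


def risk_score(event, threshold, horizon_minutes=10):
    """Project the metric forward linearly and scale it against the
    threshold, clamped to the range [0, 1]. This is a stand-in for
    real predictive analytics."""
    projected = event.value + event.trend * horizon_minutes
    return max(0.0, min(1.0, projected / threshold))


def decide(event, act_above=0.8):
    """Pick an action: act automatically on high-risk known issues,
    keep watching low-risk ones, and hand unknown patterns to a person."""
    matched = match_pattern(event)
    if matched is None:
        return "escalate_to_human"
    label, threshold = matched
    if risk_score(event, threshold) >= act_above:
        return ACTIONS[label]
    return "keep_monitoring"


if __name__ == "__main__":
    print(decide(Event("checkout", "error_rate", 0.03, 0.002)))
    print(decide(Event("core-switch", "link_utilization", 0.40, 0.0)))
    print(decide(Event("db", "disk_latency", 12.0, 0.5)))
```

In this sketch, the checkout service's error rate is still below its threshold but is projected to reach it within 10 minutes, so the engine chooses a rollback before end users are affected. The idle switch is left alone, and the unrecognized metric is passed to an admin, which is the kind of case a learning system would be expected to absorb over time.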

Vendors might push AIOps tools that include the first three or four capabilities on this list -- and, for some enterprises, that might suffice. However, as the complexity of hybrid platforms increases, AI capabilities will need to be built into the core of an operations system, rather than bolted on as an afterthought.
