Modern IT environments are distributed, with applications deployed across private data centers, multiple public clouds and edge locations. Simultaneously, hybrid work has taken hold, with the majority of employees accessing those applications from a combination of corporate locations, home offices and while traveling. As a result of applications and users being deployed in a highly distributed environment, the network has become an integral part of ensuring positive experiences.
Yet the highly distributed nature of the end-to-end network, and the lack of observability across it, result in significant complexity that impedes operations, impacts performance and delivers less than optimized experiences. In fact, the majority of respondents (54%) to the 2021 Enterprise Strategy Group (ESG) survey "Network Modernization in Highly Distributed Environments" said their networks were either more complex or significantly more complex than two years ago.
The clear reality is that these network environments are rapidly becoming too complex to manage via legacy, manual methods. At the same time, organizations that want to remain competitive in an always-on digital economy must ensure the network is always available and fully optimized. That requires network operations teams to minimize reactive troubleshooting and remediation times.
Compounding these issues is the fact that modern networking equipment generates more information than ever before. But trying to collect all that data, then correlate and interpret it in a large-scale network, is a Herculean task that has exceeded human capability. Plus, innovation in network technology is happening at a faster pace than ever before (especially for cloud-native products). It is difficult for existing employees to stay up to date and even harder to find new staff with the appropriate skill sets. This is not to say that highly experienced network operators can't do it, but it requires individual acts of IT heroism, and then all of that institutional knowledge is lost when the skilled employee leaves the company. So even if the budget were available to do it, adding more skilled resources just to keep pace with technology and maintain the networks' operational status is not a sustainable proposition.
Enter AIOps for network operations
Organizations need to leverage technology to become more operationally efficient and augment existing network resources. This is where AI/ML technology or AIOps comes in to help automate problem detection, identification and -- in certain cases -- remediation. In order for this technology to be effective, however, it needs to learn from data -- lots of data. To do this, network vendors need to be able to collect all available networking data, not just from your unique network, but from all their customers' networks (in a secure, anonymous fashion, of course), so they can take the collective knowledge from all the network environments they support and use it to build the algorithms and models that, in turn, will help every customer.
One of the critical steps to enabling these repositories of network information has been the adoption of cloud-based network management products. By moving all network data to the cloud, a massive data set can be leveraged to understand as many scenarios as possible and create intelligence that accurately determines root causes of problems. It can even provide recommendations to fix them. The good news is that cloud-based network vendors emerged about a decade ago, and now most network vendors have deployed cloud-based management options that cover a growing number of network domains (Wi-Fi, wired, WAN and data center networks). So, it would make sense that network vendors either have, or are actively building, a sizable data set to work with. By collecting data across the entire end-to-end network environment, these intelligent products can provide context and highly accurate information.
While the term AIOps is not exclusive to the network domain, it is becoming an important tool for managing and automating modern networks more efficiently. ESG's "2022 Spending Intentions Survey" highlighted that the top goal for digital transformation projects is to drive more operational efficiency. Network operations teams can use these intelligent products to operate more efficiently -- even as network environments become more distributed and complex. AIOps products, or whatever term network vendors use to describe their intelligent AI/ML products, will be critical for organizations to accelerate the adoption of new technology and ensure highly available environments. Well-implemented technology will learn and continually improve as it consumes more data. It will also be instrumental in enabling network operations teams to transition from being in a reactive response mode to becoming more proactive and predictive -- often referred to as "self-healing" and "self-optimizing" -- to ensure optimized availability and performance.
This might be a good time to interject that network operators will play an important role in the development of this technology. As the technology is learning -- sending alerts about problems or maybe even recommending solutions -- it will be imperative to have a feedback loop for the operations team to either validate the alert or recommendation, or, if it isn't correct, respond with the correct root cause or proper recommendation. Because technology is always evolving, the network operations team will always be needed to assist in the development of this network intelligence.
The important takeaway for network operations teams is that while this network intelligence is still relatively new, it is here to stay. Organizations need to embrace these technologies and work collaboratively with their network vendors to ensure they are accurate and deliver value, so operations teams can focus on deploying the next generation of network equipment without being overly burdened by trying to keep the last generation available. In addition to troubleshooting and optimization, these technologies should also help with asset identification, lifecycle management and potentially even benchmarking against like-size environments.
Bob Laliberte, principal analyst, ESG
AIOps steps for network operations teams
As with any new technology deployed in an enterprise environment, it will take time for operations teams to become comfortable with AIOps. The teams need time to validate that what the product is reporting is accurate and to trust the results it generates. Because that takes time, organizations should start working with it now and take the following steps:
- Network operations teams should start by leveraging the alerting function and providing feedback.
- Once comfortable with that capability, turn on the recommendation engine to see if it accelerates troubleshooting.
- Once you have become comfortable with certain recommendations, it might be time to automate the remediation function where possible.
- One important feature that should be included in all AIOps products is a feedback loop -- something to tell you that "x" event happened and that the software applied "y" fix, then validated that the network was back to normal operations. By sending you an alert documenting all of this, the feature will eventually enable even the most conservative network operations teams to become more comfortable with the use of AIOps.
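To make the staged approach above more concrete, here is a minimal Python sketch of how the progression from alerting to recommendations to automated remediation, along with the "x happened, y was applied, network validated" record, might be modeled. It is illustrative only: `Stage`, `AuditRecord`, `AIOpsPipeline`, the event name and the fix name are all hypothetical and do not correspond to any vendor's product or API.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Callable, List, Optional, Set


class Stage(IntEnum):
    """Hypothetical adoption stages, mirroring the steps listed above."""
    ALERT_ONLY = 1       # alerts only; operators review and give feedback
    RECOMMEND = 2        # alerts plus suggested fixes
    AUTO_REMEDIATE = 3   # fixes the team already trusts are applied automatically


@dataclass
class AuditRecord:
    """Documents that event "x" happened, fix "y" was applied, and the result."""
    event: str
    recommendation: Optional[str] = None
    fix_applied: bool = False
    validated: Optional[bool] = None          # None means no fix was attempted
    operator_feedback: Optional[str] = None   # operator's validation or correction


@dataclass
class AIOpsPipeline:
    stage: Stage
    trusted_fixes: Set[str] = field(default_factory=set)
    log: List[AuditRecord] = field(default_factory=list)

    def handle(
        self,
        event: str,
        recommendation: str,
        apply_fix: Callable[[], None],
        is_healthy: Callable[[], bool],
    ) -> AuditRecord:
        record = AuditRecord(event=event)
        # Recommendations are surfaced only once that capability is turned on.
        if self.stage >= Stage.RECOMMEND:
            record.recommendation = recommendation
        # Automation is limited to fixes the team has explicitly chosen to trust.
        if self.stage == Stage.AUTO_REMEDIATE and recommendation in self.trusted_fixes:
            apply_fix()
            record.fix_applied = True
            record.validated = is_healthy()   # confirm the network is back to normal
        self.log.append(record)
        return record

    def feedback(self, record: AuditRecord, note: str) -> None:
        """Operators confirm the alert or supply the correct root cause."""
        record.operator_feedback = note


# Usage with a simulated access point; the names here are invented for the demo.
state = {"up": False}
pipeline = AIOpsPipeline(stage=Stage.AUTO_REMEDIATE, trusted_fixes={"restart-ap-radio"})
record = pipeline.handle(
    event="ap-radio-down",
    recommendation="restart-ap-radio",
    apply_fix=lambda: state.update(up=True),
    is_healthy=lambda: state["up"],
)
pipeline.feedback(record, "confirmed: radio restart resolved the outage")
print(record)
```

In this sketch, moving from one step to the next is only a change to `stage` and to the `trusted_fixes` set, so a team can automate individual remediations one at a time while everything else stays at the alert or recommendation level.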
ESG's "Network Modernization in Highly Distributed Environments" research indicates the majority of organizations (59%) are now leveraging the recommendation engines, but uptake on full automation is still limited to about one-fifth of survey respondents (21%).
The good news is that these products are becoming more prevalent in the networking space, with most vendors providing some level of intelligence for at least part of the network. Ideally these AIOps or intelligent systems would cover the full end-to-end network environment to fully comprehend the complete environment and provide context, but even deploying a product in just one network domain can provide significant advantages for an organization.
Virtually all network vendors I have talked to now have an AI/ML, AIOps or intelligent product at some level of maturity. Some of those vendors include Arista, Aruba, Cisco, Extreme, IBM, Juniper and VMware. The bottom line is, if you haven't started to work with an intelligent AI/ML-powered or AIOps product, now is the time to get started, so you and your business don't fall behind.
ESG is a division of TechTarget.