The use of automation to reduce IT operations costs and errors has been popular for years. But as enterprises attempt to achieve zero-touch operations, many discover that a lot of manual work remains.
The move toward zero-touch IT operations
Traditional IT operations have long been based on an operations center -- sometimes called a network operations center, or NOC, when the focus is on networking. In these centers, a professional staff uses resource monitoring and management tools to remedy IT issues. A human expert typically serves as the bridge between an issue identified through monitoring and a response facilitated by a simple tool or operations script.
Operations is based on the concept of lifecycle management. An application, database, server farm and network all have a preferred operating state. Capacity planning defines the resources necessary to support that state, and lifecycle management ensures newly deployed applications achieve and maintain their preferred mode of operation -- also called a goal-state. With fully automated operations, a central control room where workers field problems is no longer necessary, as the automated tools create direct feedback from event to event-handling. This model is known as zero-touch IT operations.
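The goal-state idea can be illustrated with a small sketch. The Python below is not taken from any particular tool; `ServiceState` and `plan_actions` are hypothetical names, and the only state tracked is a replica count and a version. It shows the direct path from an observed condition to a corrective action, with no operator in between.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceState:
    """Hypothetical, simplified description of a service's operating state."""
    replicas: int
    version: str


def plan_actions(goal: ServiceState, observed: ServiceState) -> list[str]:
    """Compare the observed state with the goal-state and list corrective steps."""
    actions = []
    if observed.version != goal.version:
        actions.append(f"redeploy version {goal.version}")
    if observed.replicas < goal.replicas:
        actions.append(f"start {goal.replicas - observed.replicas} replica(s)")
    elif observed.replicas > goal.replicas:
        actions.append(f"stop {observed.replicas - goal.replicas} replica(s)")
    return actions


goal = ServiceState(replicas=3, version="1.4")
observed = ServiceState(replicas=1, version="1.4")
print(plan_actions(goal, observed))  # ['start 2 replica(s)']
```

An empty list means the service is already in its goal-state. A real tool would run this comparison continuously and would have to execute the steps, which is where much of the remaining manual work tends to hide.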
Applications made up of multiple components, as well as networks based on a mixture of devices and equipment, have increased IT complexity. Virtualization and the cloud, combined with these multi-component applications, make app deployment a complicated ballet of coordinated steps, including parameterization and resource assignments. DevOps tools and container orchestration systems such as Kubernetes aim to ease these deployment challenges. Sometimes these tools work to automate operations and reduce human effort, costs and errors -- but sometimes, they don't.
Three challenges with a zero-touch model
What frustrates CIOs is that their zero-touch transformation projects are expensive. This could be why -- according to a CIMI Corp survey -- over three-quarters of all large enterprises say they're evaluating zero-touch models, but only a small fraction have pulled the trigger on projects. In 2019, in fact, the number of enterprises that adopted zero-touch IT operations was smaller than the number that canceled their projects, according to the survey.
The first, and perhaps most significant, problem users report with a zero-touch model is that the software to support it doesn't actually automate the entire lifecycle. The complaint is that while these tools automate deployment and redeployment processes, as well as detect abnormal operating conditions through monitoring, they don't properly close the loop from alarm to response and remedy.
This is a challenge because a human who intervenes in an automated process lacks critical context. Suppose, for example, an operations expert finds that a certain component has failed but was not the one who committed that component in the first place. The operator won't have a sense of what the component did and, therefore, what its failure means. There's a getting-up-to-speed issue here -- and the more an automated toolkit does before it throws the unfinished task to the operator, the more work the operator has to do to understand the problem and launch a response.
The second challenge users cite with zero-touch IT operations is the classic fault avalanche. VMs, virtual networks, microservices and other high-level technology run on and depend on something at a lower level. If something fails at that underlying level, then everything above it fails, too. Depending on how IT teams monitor, correlate and report on faults, an operations expert can end up with a thousand reported problems when really there's only one.
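One common way to tame a fault avalanche is to correlate alarms against a dependency map and report only the components that have no failed dependency beneath them. The sketch below assumes such a map is available; the component names and the `root_causes` function are made up for illustration, and real correlation engines also weigh timing and partial failures.

```python
def root_causes(depends_on: dict[str, list[str]], alarming: set[str]) -> set[str]:
    """Return alarming components with no alarming dependency anywhere below them."""

    def has_failed_dependency(component: str) -> bool:
        stack = list(depends_on.get(component, []))
        seen = set()
        while stack:
            dep = stack.pop()
            if dep in seen:
                continue
            seen.add(dep)
            if dep in alarming:
                return True
            stack.extend(depends_on.get(dep, []))
        return False

    return {c for c in alarming if not has_failed_dependency(c)}


# Hypothetical topology: two VMs and a virtual network on one host,
# with one service on each VM.
depends_on = {
    "vm-a": ["host-1"],
    "vm-b": ["host-1"],
    "vnet-1": ["host-1"],
    "svc-x": ["vm-a"],
    "svc-y": ["vm-b"],
}
alarming = {"host-1", "vm-a", "vm-b", "vnet-1", "svc-x", "svc-y"}
print(sorted(root_causes(depends_on, alarming)))  # ['host-1']
```

Six alarms collapse into one reportable fault. The result is only as good as the dependency map, which is hard to keep current in virtualized environments.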
A third problem is that automated operations tools, especially suites of loosely coupled elements, don't hand off an entire problem to human overseers, but rather just the steps the tools themselves can't cover. Human intervention then often collides with automated responses to issues, which creates a conflict of remedies. This problem is particularly acute when data center operations, network operations and application lifecycle management reside in different places.
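One way to avoid a conflict of remedies is to require every actor, human or automated, to claim a resource before acting on it. The following is a minimal in-memory sketch; `RemedyRegistry` and the actor names are hypothetical, and a real deployment would need a store shared across the data center, network and application teams.

```python
class RemedyRegistry:
    """Tracks which actor currently owns the remedy for each resource."""

    def __init__(self) -> None:
        self._owners: dict[str, str] = {}

    def claim(self, resource: str, actor: str) -> bool:
        """Grant the claim if the resource is unclaimed or already held by this actor."""
        owner = self._owners.setdefault(resource, actor)
        return owner == actor

    def release(self, resource: str, actor: str) -> None:
        """Release the claim, but only if this actor holds it."""
        if self._owners.get(resource) == actor:
            del self._owners[resource]


registry = RemedyRegistry()
print(registry.claim("db-1", "auto-failover"))   # True
print(registry.claim("db-1", "operator:alice"))  # False: automation is already acting
registry.release("db-1", "auto-failover")
print(registry.claim("db-1", "operator:alice"))  # True
```

A refused claim tells the operator that a remedy is already in progress, which is the information that is missing when tools hand off only the steps they can't cover.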
All of these challenges, according to users, can lead to IT horror stories, including total data center failures.
Advice to go zero-touch
Enterprises should ease into zero-touch automation. Don't expect to eliminate operations centers completely; expect to make them more productive. Aim to improve routine tasks and define policies to handle simple issues through automation. While zero-touch tools will reduce manual effort, they'll still need to be monitored by human operators.
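A policy of this kind can be as simple as a table of event types that are safe to fix automatically, with everything else escalated. The sketch below is one possible shape, not a recommendation of specific remedies; the event kinds, the `AUTO_REMEDIES` table and the `handle` function are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical policy table: event kinds considered safe to fix without a human.
AUTO_REMEDIES = {
    "process_crashed": "restart_process",
    "disk_nearly_full": "rotate_logs",
}


@dataclass
class Event:
    kind: str
    resource: str
    steps_taken: list[str] = field(default_factory=list)


def handle(event: Event) -> dict:
    """Apply an automated remedy if policy allows it; otherwise escalate with context."""
    action = AUTO_REMEDIES.get(event.kind)
    if action is not None:
        event.steps_taken.append(action)
        return {"handled_by": "automation", "action": action}
    # No policy matches: pass the operator the whole event, not a fragment of it.
    return {
        "handled_by": "operator",
        "kind": event.kind,
        "resource": event.resource,
        "steps_taken": list(event.steps_taken),
    }


print(handle(Event("process_crashed", "web-1")))
print(handle(Event("storage_array_degraded", "san-2"))["handled_by"])  # operator
```

Starting with a short table and growing it as confidence builds is one way to ease in, and the escalation record keeps the operations center informed about what automation has already tried.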