How to tackle network automation risks and tasks

Automation can make networks more efficient, but network engineers must first mitigate the risks. A good way to start is by setting up simple tasks that deliver operational value.

Many network engineers and network managers are reluctant to deploy network automation. Anyone who has run a network for a reasonable length of time has likely experienced a major network outage. Major outages are stressful and unpleasant, so teams tend to avoid anything that might cause one. If simple changes can cause a major outage, it's reasonable to question why anyone would consider using automation, which can quickly propagate a bad configuration across the entire network.

If automation is the reason for a broken network, teams likely won't turn to automation for the fix. Instead, the go-to remediation tool is typically the command-line interface, which teams use to configure one device at a time.

If a team must correct 100 devices at one minute per configuration, the fix takes 100 minutes, or more than an hour and a half. Scale that by the time each correction actually takes and by the number of devices affected, and it's easy to see why network teams are reluctant to use automation.

But do network automation risks really outweigh the benefits? And can teams mitigate those risks? To start, let's look at why enterprises need to use network automation and the risks of not adopting it.

Why teams should use network automation

Standardized designs, not snowflakes. Complex network designs -- so-called snowflake designs -- add risk because one part of the network is configured differently than another part. The lack of standards increases the risk of changes in each part of the network. Standardization matters because the network team deals with fewer -- or no -- special cases, so it can better determine failure modes and develop standard procedures to handle them.

Using standardized building blocks for network designs simplifies network automation. Equipment may cost a little more for building-block designs, but the tradeoffs are reduced Opex and greater resilience. By using standard operating procedures for troubleshooting and remediation, teams can more easily understand and mitigate failures.

Building-block network designs are much easier to automate. Automation can assist with initial configuration, configuration updates, physical connectivity validation and troubleshooting.
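To make the building-block idea concrete, here is a minimal Python sketch that renders configuration for every access port from one shared template. The template name, the port fields and the command syntax are all illustrative assumptions rather than any vendor's actual configuration language.

```python
from string import Template

# Hypothetical building block: one access-port template shared by every site.
# The command syntax is illustrative only, not tied to a specific vendor.
ACCESS_PORT = Template(
    "interface $name\n"
    " description $description\n"
    " switchport mode access\n"
    " switchport access vlan $vlan\n"
)


def render_access_ports(ports):
    """Render one configuration stanza per port from the shared template."""
    # substitute() raises KeyError if a port is missing a required field,
    # so an incomplete definition fails before any config is produced.
    return "\n".join(ACCESS_PORT.substitute(port) for port in ports)


if __name__ == "__main__":
    # Made-up port definitions for one site.
    site_ports = [
        {"name": "Gi1/0/1", "description": "printer", "vlan": 20},
        {"name": "Gi1/0/2", "description": "desk-101", "vlan": 10},
    ]
    print(render_access_ports(site_ports))
```

Because every port comes from the same template, a design change means editing one building block instead of many hand-built configurations.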

Network agility. Network automation has lagged behind compute and storage system automation, and it has to catch up. Companies that delay the adoption of complete IT automation face the risk of losing out to their more agile competition.

Automation means the entire organization's use of IT resources is more efficient. Efficiency translates into productivity and greater profits with the same number of staff. A more stable IT environment means more stability for customers and greater customer satisfaction. In many cases, this can enable higher prices and larger market share.

An agile network can also adapt more easily to new network technologies. The network team only needs to make incremental changes to a few building-block designs and to the associated automation tasks.

Network automation risks

Adopting automation isn't without its own risks, however. Any ill-prepared and poorly implemented process can break the network -- and automation is no exception.

Here are some points network teams can consider to reduce network automation risks:

Start small and simple. Automation is best adopted by starting with simple tasks. Begin by building some simple scripts that perform basic, read-only troubleshooting or network analysis, such as tracking down a media access control (MAC) address or finding the root bridge in a spanning tree domain. Automate the investigative or diagnostic tasks that the team performs most often and that consume the most time. Don't make any automatic changes at this stage; focus instead on learning the automation tools that provide real value to network operations.
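As one example of a read-only starter task, the sketch below searches the text of a MAC address table for a given address. The table layout is an assumption made for illustration; real output varies by vendor and software version, and collecting it from a device is left out entirely.

```python
import re

# Sample text in the general shape of a switch MAC address table.
# The column layout is an assumption; real output varies by vendor and version.
SAMPLE_TABLE = """\
Vlan    Mac Address       Type        Ports
----    -----------       ----        -----
  10    0011.2233.4455    DYNAMIC     Gi1/0/1
  20    aabb.ccdd.eeff    DYNAMIC     Gi1/0/7
  20    0011.2233.4466    STATIC      Po1
"""


def normalize_mac(mac):
    """Reduce a MAC in any common notation to 12 lowercase hex digits."""
    digits = re.sub(r"[^0-9a-fA-F]", "", mac).lower()
    if len(digits) != 12:
        raise ValueError(f"not a MAC address: {mac!r}")
    return digits


def find_mac(table_text, mac):
    """Return (vlan, port) pairs for rows whose MAC matches. Read-only."""
    wanted = normalize_mac(mac)
    matches = []
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows have four columns and start with a numeric VLAN ID.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        try:
            row_mac = normalize_mac(fields[1])
        except ValueError:
            continue  # skip rows whose second column is not a MAC address
        if row_mac == wanted:
            matches.append((int(fields[0]), fields[3]))
    return matches


if __name__ == "__main__":
    print(find_mac(SAMPLE_TABLE, "AA:BB:CC:DD:EE:FF"))
```

The script only parses text and changes nothing, which keeps the risk low while the team learns its tooling.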

Testing. Network automation needs to adopt the same process that application development uses: extensive testing. Application developers can quickly bring up server virtual machines and client testing VMs and automatically run extensive tests. In contrast, network testing has historically been a problem, because test labs were too expensive and time-consuming to set up.

Building-block designs reduce the number of variations that need to be tested. Vendors are also offering virtual instances of many device types -- frequently at little or no charge, but with limited performance. That makes it practical to verify configuration changes on these virtual devices before applying them to the operational network.

The network team may need to work with the rest of IT to create a test environment that accurately reflects the operational network. Ideally, the test environment will include applications and test clients to generate network traffic.

Network validation. Intent-based networking (IBN) is the latest industry buzz, and you can start with IBN by creating a set of basic network checks. Verifying the network state is a great way to reduce automation risks. Verification is also a great tool to validate that your network is functioning as you intended, even before you adopt automated change.

To validate that your network is connected and operating as intended, consider the network state. This includes device interface state, address assignment, neighboring devices, and Layer 2 and Layer 3 protocol information. In this phase, you're not making any changes to the network. The intent-based validation script should raise an alert when a check fails, so teams can then take the appropriate action.
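A validation check of this kind can be as simple as comparing the state you intend with the state you observe. The sketch below assumes both are already available as plain dictionaries; the device names, check keys and values are made up, and how the observed state is collected is deliberately out of scope.

```python
def validate_state(intended, observed):
    """Compare intended and observed state; return a list of failure messages."""
    failures = []
    for device, checks in intended.items():
        actual = observed.get(device)
        if actual is None:
            failures.append(f"{device}: no state collected")
            continue
        for key, expected in checks.items():
            found = actual.get(key)
            if found != expected:
                failures.append(
                    f"{device}: {key} expected {expected!r}, found {found!r}"
                )
    return failures


if __name__ == "__main__":
    # Hypothetical intent: interface state, neighbor and address assignment.
    intended = {
        "core-sw1": {
            "Gi1/0/1.status": "up",
            "Gi1/0/1.neighbor": "dist-sw1",
            "Vlan10.address": "10.0.10.1/24",
        }
    }
    # Hand-written sample of observed state with one deviation.
    observed = {
        "core-sw1": {
            "Gi1/0/1.status": "down",
            "Gi1/0/1.neighbor": "dist-sw1",
            "Vlan10.address": "10.0.10.1/24",
        }
    }
    for failure in validate_state(intended, observed):
        print("ALERT:", failure)  # stand-in for a real alerting channel
```

Returning the failures as a list, rather than printing inside the function, lets the same check feed an alert today and a change workflow later.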

The network validation scripts then become tools you can use in a future change process to perform pre- and post-change network validation checks. If any pre-change validation check fails, abort the change. Similarly, if a post-change validation check fails, alert the network staff and potentially back out of the change. After reversing a change, repeat the pre-change validation to confirm the network returned to its pre-change state.
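The pre- and post-change sequence described above can be sketched as a small wrapper function. Everything here is hypothetical: the function names, the return values and the simulated network are assumptions for illustration, and the sketch always backs out after a failed post-change check, whereas a real process might leave that decision to the network staff.

```python
def run_change(validate, apply_change, rollback, alert):
    """Wrap a change in pre- and post-change validation.

    validate() returns a list of failure messages (empty means healthy).
    Returns "aborted", "applied", "rolled_back" or "rollback_failed".
    """
    pre = validate()
    if pre:
        alert("pre-change validation failed; change aborted", pre)
        return "aborted"

    apply_change()
    post = validate()
    if not post:
        return "applied"

    alert("post-change validation failed; backing out", post)
    rollback()
    # Repeat the same validation to confirm the pre-change state is restored.
    after = validate()
    if after:
        alert("network did not return to pre-change state", after)
        return "rollback_failed"
    return "rolled_back"


if __name__ == "__main__":
    network = {"uplink": "up"}  # simulated state standing in for real devices

    def validate():
        return [] if network["uplink"] == "up" else ["uplink is down"]

    def bad_change():
        network["uplink"] = "down"

    def undo():
        network["uplink"] = "up"

    def alert(message, failures):
        print(message, failures)

    print(run_change(validate, bad_change, undo, alert))
```

Passing the validation, change and rollback steps in as functions keeps the workflow independent of any particular tool, so the read-only checks written earlier can be reused unchanged.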

Making it work

The most important concept with any network change system is to adopt processes that reduce risk. Manual changes use change control boards and review cycles, and these processes will still be necessary. But automation will add further steps, such as automated pre-change and post-change validation.

If you're just getting started with automation, limit your work to read-only tasks that won't affect the network. Most importantly, get started with network automation now.
