Overview of network management tasks and best practices

Network management encompasses a range of tasks and processes. Explore 10 crucial tasks and accompanying best practices to ensure a resilient and functional network.

As the network goes, so goes the entire infrastructure. Network mismanagement can severely affect all servers, applications and services on which a business relies. That's why it's so important that every network operations management task be taken seriously and held to the highest standards.

In this article, we look at 10 critically important network management tasks and provide tips on how network teams can properly handle them using best practice processes and tools.

1. Network configuration

When networks are properly architected, configuration templates -- sometimes referred to as boilerplates -- are built and updated as needed. The purpose of these templates is twofold. First, they help administrators configure new devices for deployment more quickly. Second, they help ensure configurations are uniform from one device to the next.
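As a minimal sketch of the template idea, the Python snippet below fills one boilerplate with per-device values. The template text, the variable names and the device inventory are all hypothetical, and the CLI-style lines are illustrative rather than any vendor's exact syntax.

```python
from string import Template

# Hypothetical boilerplate for an access switch (illustrative syntax only).
ACCESS_SWITCH_TEMPLATE = Template(
    "hostname $hostname\n"
    "interface Vlan$mgmt_vlan\n"
    " ip address $mgmt_ip $mgmt_mask\n"
    "snmp-server location $site\n"
)


def render_config(values: dict) -> str:
    """Fill the boilerplate; substitute() raises KeyError if a value is missing."""
    return ACCESS_SWITCH_TEMPLATE.substitute(values)


# Hypothetical inventory: only the per-device values differ.
devices = [
    {"hostname": "sw-floor1", "mgmt_vlan": "10", "mgmt_ip": "10.0.10.11",
     "mgmt_mask": "255.255.255.0", "site": "HQ"},
    {"hostname": "sw-floor2", "mgmt_vlan": "10", "mgmt_ip": "10.0.10.12",
     "mgmt_mask": "255.255.255.0", "site": "HQ"},
]

configs = {d["hostname"]: render_config(d) for d in devices}
print(configs["sw-floor1"])
```

Because every device is rendered from the same boilerplate, the configurations stay uniform, and a missing value fails loudly instead of producing a half-filled configuration.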

Modern methods for managing network configurations include network automation platforms, as well as software-defined networking technologies that centralize all network configurations within the control plane.

2. Network monitoring and alerting

An important network management task is to closely watch the operational health of an enterprise network to ensure uptime and optimal performance. Protocols and health monitoring services, such as Simple Network Management Protocol (SNMP), syslog, NetFlow and deep packet inspection, can help teams monitor the network and automatically trigger alerts when issues arise.
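The alerting half of this task can be sketched as a simple threshold check. The example below assumes metrics have already been collected (for example, through SNMP polling) into a dictionary; the metric names and threshold values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Threshold:
    metric: str
    warn: float
    crit: float


# Hypothetical thresholds; real values depend on the network and the business.
THRESHOLDS = [
    Threshold("cpu_percent", warn=80, crit=90),
    Threshold("link_util_percent", warn=70, crit=90),
]


def evaluate(sample: dict, thresholds: list) -> list:
    """Return (metric, severity, value) for every threshold the sample breaches."""
    alerts = []
    for t in thresholds:
        value = sample.get(t.metric)
        if value is None:
            continue  # metric not collected for this device
        if value >= t.crit:
            alerts.append((t.metric, "critical", value))
        elif value >= t.warn:
            alerts.append((t.metric, "warning", value))
    return alerts


# One polled sample from a hypothetical device.
alerts = evaluate({"cpu_percent": 95, "link_util_percent": 72}, THRESHOLDS)
for metric, severity, value in alerts:
    print(f"{severity.upper()}: {metric}={value}")
```

A production monitoring platform adds collection, deduplication and notification on top of this, but the core decision is the same comparison of a measured value against an agreed limit.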

3. Troubleshooting and root cause analysis

When a network failure or performance problem arises, the network admin is responsible for identifying and remediating the problem as quickly as possible. As part of this process, admins should perform a thorough root cause analysis to pinpoint the true cause of the failure and document what was done to eliminate the threat -- or, at least, reduce the event's effect on the organization. Modern tools, such as AIOps platforms, use machine learning to help automate troubleshooting and root cause analysis processes.

4. Change control management

When admins need to make changes to a production network, they must closely control the entire process from start to finish. This includes dictating who can make the changes, in what time frame the changes should occur and how the changes should be announced, as well as requiring a peer review of the proposed changes.
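The four controls named above can be expressed as a pre-approval check. This is a sketch only: the `ChangeRequest` fields, the list of authorized engineers and the maintenance window are hypothetical stand-ins for whatever the organization's change control platform records.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Optional


@dataclass
class ChangeRequest:
    author: str
    reviewer: Optional[str]
    scheduled: datetime
    announced: bool


# Hypothetical policy values.
AUTHORIZED = {"alice", "bob"}
WINDOW_START, WINDOW_END = time(22, 0), time(23, 59)


def violations(cr: ChangeRequest) -> list:
    """List every policy rule the change request breaks; empty means approvable."""
    problems = []
    if cr.author not in AUTHORIZED:
        problems.append("author not authorized")
    if not cr.reviewer or cr.reviewer == cr.author:
        problems.append("missing independent peer review")
    if not (WINDOW_START <= cr.scheduled.time() <= WINDOW_END):
        problems.append("outside maintenance window")
    if not cr.announced:
        problems.append("change not announced")
    return problems


request = ChangeRequest("alice", "bob", datetime(2024, 1, 10, 22, 30), True)
print(violations(request))
```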

While network-centric change control management tools are available, most enterprise IT shops opt for a centralized change control platform that all teams can use. These tools are typically part of -- or directly integrated with -- the IT department's service ticketing platform.

5. Firmware bug and vulnerability patching

While network device firmware isn't patched nearly as often as server OSes and applications, it is patched far more frequently today than it was a few years ago. The reason is the sheer number of operational bugs and, more importantly, security vulnerabilities.

Admins should put processes in place that enable them to review firmware update notes and verify whether a known bug or vulnerability can significantly affect the business. Based on this research, they should handle firmware patching like any other network change that goes through a thorough change control process.
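One way to sketch that review is to compare the running firmware version with the versions in which each advisory was fixed, and then keep only the advisories that touch features actually in use. The advisory records, field names and version numbers below are invented for illustration; real vendor advisories come in their own formats.

```python
def parse_version(text: str) -> tuple:
    """Turn a dotted version such as '15.2.4' into (15, 2, 4) for comparison."""
    return tuple(int(part) for part in text.split("."))


def relevant_advisories(running: str, advisories: list, features_in_use: set) -> list:
    """Return IDs of advisories not yet fixed in `running` that affect a used feature."""
    hits = []
    for adv in advisories:
        unfixed = parse_version(running) < parse_version(adv["fixed_in"])
        if unfixed and adv["feature"] in features_in_use:
            hits.append(adv["id"])
    return hits


# Hypothetical advisory data taken from release notes.
ADVISORIES = [
    {"id": "ADV-1", "fixed_in": "15.2.7", "feature": "ssh"},
    {"id": "ADV-2", "fixed_in": "15.2.7", "feature": "bgp"},
    {"id": "ADV-3", "fixed_in": "15.2.3", "feature": "ssh"},
]

to_patch = relevant_advisories("15.2.4", ADVISORIES, {"ssh", "ospf"})
print(to_patch)
```

Comparing versions as integer tuples avoids the string-comparison trap in which "15.2.10" sorts before "15.2.7". Any advisory that survives the filter would then enter the normal change control process.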

6. Configuration backup and secure storage

Many legacy network devices still use command-line interfaces for configuration and management purposes. In the event of a catastrophic hardware failure, it's critical to have a text copy of these configurations that can be pasted into spare or replacement equipment. Policies for storing these valuable configurations should include processes for file encryption and limited access to the backup file repository.
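A minimal sketch of the storage side follows, assuming a POSIX file system and a hypothetical repository layout of one `.cfg` file plus one `.sha256` file per device. It restricts file access to the owner and records a checksum so later tampering or corruption can be detected. It does not encrypt the file; encryption would need an additional tool or library and is left out here.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def store_backup(repo: Path, hostname: str, config_text: str) -> str:
    """Write the config with owner-only permissions and record its SHA-256 digest."""
    repo.mkdir(parents=True, exist_ok=True)
    data = config_text.encode("utf-8")
    digest = hashlib.sha256(data).hexdigest()
    # Mode 0o600 limits access to the file owner (honored on POSIX systems).
    fd = os.open(repo / f"{hostname}.cfg", os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "wb") as fh:
        fh.write(data)
    (repo / f"{hostname}.sha256").write_text(digest)
    return digest


def verify_backup(repo: Path, hostname: str) -> bool:
    """Check that the stored config still matches its recorded digest."""
    data = (repo / f"{hostname}.cfg").read_bytes()
    expected = (repo / f"{hostname}.sha256").read_text().strip()
    return hashlib.sha256(data).hexdigest() == expected


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / "backups"
    store_backup(repo, "sw-floor1", "hostname sw-floor1\n")
    print(verify_backup(repo, "sw-floor1"))
```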

In modern, cloud-managed network architectures, it's often the service provider's responsibility to maintain and protect configuration backups. However, some cloud network service providers permit customers to copy and store their configurations wherever they choose. In these cases, it's important that enterprises store backups outside the provider's cloud in the event of a major service provider outage.

7. Policy and compliance validation

Admins must regularly review all network policies to ensure the network is optimized not only from a performance standpoint, but also from a security, compliance and regulatory perspective. Depending on the type of business an organization operates, teams may need to enforce and regularly review compliance with the Sarbanes-Oxley Act, the Payment Card Industry Data Security Standard (PCI DSS) and HIPAA. Network automation tools that include automated security and compliance verifications can help speed the validation process.
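An automated verification of this kind can be sketched as a check of each device configuration against lists of required and forbidden statements. The rules below are hypothetical examples of an internal policy, not requirements taken from any of the regulations named above, and the CLI-style lines are illustrative.

```python
# Hypothetical internal policy.
REQUIRED = ["service password-encryption", "logging host 10.0.0.50"]
FORBIDDEN_PREFIXES = ["snmp-server community public", "transport input telnet"]


def audit(config_text: str) -> dict:
    """Report required lines that are absent and forbidden lines that are present."""
    lines = [line.strip() for line in config_text.splitlines() if line.strip()]
    return {
        "missing": [rule for rule in REQUIRED if rule not in lines],
        "forbidden": [
            prefix for prefix in FORBIDDEN_PREFIXES
            if any(line.startswith(prefix) for line in lines)
        ],
    }


sample_config = """\
hostname sw-floor1
service password-encryption
snmp-server community public RO
line vty 0 4
 transport input ssh
"""

report = audit(sample_config)
print(report)
```

Running the same audit against every stored configuration turns a periodic manual review into a repeatable check whose findings can be tracked over time.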

8. Network diagrams

As networks grow in complexity, it's more important than ever to maintain detailed and accurate physical and logical network diagrams. While seasoned network engineers may prefer drawing and updating their own diagrams manually using tools like Microsoft Visio, many have concluded that their networks are too complex -- and change too frequently -- for manual diagrams to keep up. Thus, tools that automatically scan and map the network topology are becoming a popular alternative. While these automated diagrams may not be as visually appealing or include all necessary information, admins can at least be assured they are up to date.
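To illustrate how an automated diagram can be generated, the sketch below converts neighbor data into Graphviz DOT text, which Graphviz can render as an image. The neighbor table is a hypothetical result of a discovery scan; how that data is collected is outside this example.

```python
def to_dot(neighbors: dict) -> str:
    """Build an undirected Graphviz DOT graph from a device-to-neighbors mapping."""
    edges = set()
    for device, peers in neighbors.items():
        for peer in peers:
            # Sort each pair so A--B and B--A collapse into a single link.
            edges.add(tuple(sorted((device, peer))))
    lines = ["graph network {"]
    lines += [f'  "{a}" -- "{b}";' for a, b in sorted(edges)]
    lines.append("}")
    return "\n".join(lines)


# Hypothetical discovery output: each device lists the neighbors it sees.
NEIGHBORS = {
    "core1": ["dist1", "dist2"],
    "dist1": ["core1", "acc1"],
    "dist2": ["core1"],
    "acc1": ["dist1"],
}

dot_text = to_dot(NEIGHBORS)
print(dot_text)
```

Because the diagram is regenerated from discovery data rather than drawn by hand, it stays current whenever the scan is rerun.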

9. Network resiliency

Mission-critical networks are designed and built with high availability in mind. This includes factors like physical cabling redundancy, dynamic routing protocols and maintenance of spare equipment for use in the event of a production hardware failure. Network resiliency tasks also include steps to regularly test and evaluate network response times in the event of a failure.

Another crucial part of verifying network resiliency is to ensure that production hardware and software are properly licensed and are under appropriate levels of support contracts. This includes understanding hardware replacement times, vendor support hours and methods, and detailed steps required to resolve common problems from start to finish.

10. Short- and long-term roadmapping

Lastly, network admins should have processes in place to create short- and long-term network architecture roadmaps. These exercises help to understand where the network is today, what it's capable of in the near term and what the catalyst will be that dictates major upgrades in the future. This network management task requires that administrators read up on, study and receive demonstrations on new and emerging network technologies. Doing so helps admins plan next steps and avoid architecting the network into a corner.