Free network disaster recovery plan template and guide
Proper network disaster recovery planning can mean the difference between a brief disruption and extensive downtime. Download our free template to get started building a plan.
Ensuring viability and uninterrupted service of network resources is a high priority for any business. Protect those valuable investments from unplanned disruptions with a network disaster recovery plan.
Most organizations depend on voice and data communications and have LANs and WANs. Network services are typically provided by multiple vendors, such as local telephone companies, WAN carriers, ISPs, wireless carriers and managed network service providers. Disaster recovery plans for networks are essential to businesses. A solid network DR plan helps ensure critical network services can be recovered in a disruption.
Developing a network disaster recovery plan requires the organization to know details about the procedures needed to recover and restart network resources. It also requires procedures for coordinating with outside resources. Plan complexity varies between organizations depending on size, industry and the technologies used in the network.
This guide and its associated network disaster recovery plan template will examine the issues that organizations should address when preparing and deploying a network disaster recovery plan.
Examples of network service interruptions
Network interruptions can result from many situations. The following is a list of such events:
- Carrier infrastructure equipment failures.
- Carrier software failures.
- Breaks in both overhead and buried cables.
- Carrier switching center disasters.
- Damage to telephone poles from vehicular accidents.
- Damage to undersea cables.
- Loss of communications satellites.
- Damage to wireless carrier towers and infrastructure.
- Loss of internal LAN components.
- Failure of network components.
- Loss of VoIP technology.
- Loss of the internet.
- Cybersecurity attacks.
- Phishing and DDoS attacks.
- Rogue employees who damage network resources.
Securing network perimeters and overall infrastructure from unauthorized access, viruses, ransomware or attacks is typically handled by cybersecurity plans, which can be complemented by network disaster recovery plans.
Make network disaster recovery planning a priority
When it comes to network infrastructure, disaster recovery planning might not be a huge priority, especially considering the threats from cyberattacks, viruses, phishing attacks and denial-of-service attacks. Network security usually takes precedence because a porous perimeter spells doom for most organizations, and preventing damage from cyberattacks is what gets management's attention today.
On the voice side, continued deployment of VoIP technology has increased the importance of robust network security initiatives. VoIP is simply another application using existing network resources, so it has vulnerabilities that businesses must address the same as other network-based systems.
In the past, private branch exchange systems typically used separate network facilities and did not overlap with data networks. However, as it became more cost-effective to share voice and data traffic over digital T1 and ISDN access lines, the risks to voice communications systems increased. These systems are complicated by integrated networks that support voice, data, text, video and social media.
Most network activity today is digital, and the variety and complexity of applications using digital networking underscore the need for network disaster recovery and resilience.
Why networking requires its own DR plan
Network disaster recovery planning is critical for enterprise WANs and LANs, no matter how large the organization or potential disaster. A network-specific DR plan is necessary for the same reasons that cybersecurity planning has become such a high priority.
Today, organizations have voice, data, wireless, internet access and other network services sharing the same network resources. It is essential to not only protect network access lines and the interface devices that support these services, but to be able to quickly recover and restart those network resources following a disruption.
A resilient network environment is also indispensable for providing business continuity. This ensures the viability of local and remote network access, physical or virtual servers, and storage resources. This is important regardless of location, which could be a central data center, colocation facility, MSP, ISP or the cloud. Network resilience is essential for branch and remote offices that require access to company resources.
Getting started with network DR planning
Before starting the process of creating a network disaster recovery plan, consider these important points:
- Take the disaster recovery planning process seriously. A plan is necessary to protect network infrastructure and related assets from unplanned events that could disrupt network operations. It doesn't have to be hundreds of pages long; a one-page plan with the right information can be more valuable than a voluminous document that nobody can use.
- Use business continuity standards as a starting point. Standards provide an excellent starting point for developing a new plan or evaluating an existing plan. Two important standards for network DR planning are ISO/IEC 27031:2011 Information technology -- Security techniques -- Guidelines for information and communication technology readiness for business continuity, which is expected to be updated within the year, and NIST SP 800-34 Contingency Planning Guide for Federal Information Systems.
- Keep it as simple as possible. A DR plan should reflect the structure and complexity of the networks it protects, without adding more detail than those networks require.
- Limit content to actual disaster response actions. When creating a plan to respond to specific network-related incidents, include only the information needed for the response and subsequent recovery.
- Ensure availability of network services, hardware and software. Networks are a mix of hardware (routers, switches), software (network management) and network services (WANs, internet, local access). Each element in a network is at risk, and regardless of the type of resource, it is essential to identify ways to back them up, make them more geographically diverse, and build inventories of spare parts and software.
- Keep the plan up to date and test it often. Once the plan is complete, test the plan at least twice annually -- more often if the network configuration changes -- to ensure documented procedures make sense in the sequence indicated, and that the plan can be effectively executed.
- Be flexible. A single network disaster recovery plan template might not be applicable to all organizations, especially if the organization has many corporate locations served by the network, multiple data centers and services provided by third parties.
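The testing cadence described above can be expressed as a simple check. The following is a minimal sketch, not a prescribed method: the function name `plan_test_due`, the 182-day interval used to approximate "twice annually" and the rule that any configuration change after the last test triggers a new one are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: "at least twice annually" is approximated as every 182 days.
TEST_INTERVAL = timedelta(days=182)


def plan_test_due(last_test, today, last_config_change=None):
    """Return True when the network DR plan should be tested again.

    A test is due if the network configuration changed after the last
    test, or if the assumed interval has elapsed since the last test.
    """
    if last_config_change is not None and last_config_change > last_test:
        return True
    return today - last_test >= TEST_INTERVAL
```

For example, `plan_test_due(date(2024, 1, 10), date(2024, 3, 1))` returns `False`, but passing `last_config_change=date(2024, 2, 1)` makes it return `True` because the configuration changed after the last test.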
Network disaster recovery plan components
The included network disaster recovery plan template is a great way to get started with creating a plan or adjusting an existing one. When using the template, the following are key issues to address and activities to perform:
- Initial data. Place the contact data of primary and backup networking staff at the front of the plan so responders do not lose time paging through a lengthy document.
- Revision management. Have a page that reflects change management activities such as revisions, the date of the revision and who approved it.
- Purpose and scope. Provide details about these elements, as well as any assumptions, team descriptions and other relevant information. Include details on how often the plan is to be reviewed and updated, and by whom.
- Emergency instructions on how to activate the plan. Provide data on circumstances under which the organization will activate the plan. This should include outage time frames, who declares a disaster, whom to contact and response procedures to be used.
- Policy information. A network DR policy is advisable, along with other IT policies, to provide structure and guidance on network DR planning. If the IT department has a business continuity and disaster recovery policy, include relevant network DR policy information and reference the use of standards.
- Details of the plan. Wherever possible, provide step-by-step procedures. These are easier to follow than broad general statements such as "Reconfigure network channels to alternate location," which might require significant detail to complete properly.
- Checklists and flow diagrams. Assuming a network disruption has occurred, identify steps to address it. These can be in the form of checklists and flow diagrams that provide a high-level view of response and recovery.
- Information gathering. It is necessary for IT teams to gather information before officially declaring a network disruption. This includes network performance data and firsthand reports from IT staff, employees and first responders, if needed. Convene meetings as soon as possible with key IT network emergency team members to evaluate the facts before proceeding to a declaration.
- Declaring a disaster. Once IT has obtained the initial facts about the network disruption, the plan should list actions to take when it becomes necessary to declare a network disaster. Specify who is authorized to declare a network disaster.
- Recovering from a disaster. Once the situation has been brought under control, subsequent parts of the plan should provide instructions on recovering and restoring network operations, restoring network connectivity devices, and related activities.
- Plan testing. Specify the activities associated with tests of the network DR plan. This can include network vendors, equipment vendors, types of tests and who is involved in tests.
- Plan review and maintenance. Establish a process, typically using the company's change management functions, to make changes to the network DR plan. Also establish a schedule of plan reviews to ensure the plan remains appropriate and actionable.
- Appendixes. Detailed appendixes should be provided at the end of the template. Include lists and contact details for all IT and non-IT emergency teams, primary and alternate network vendors, primary and alternate equipment and software vendors, alternate network configuration data, primary and alternate sources of hardware and software, and other relevant information. It is critical to keep this information up to date.
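The activation and declaration items above (outage time frames and who is authorized to declare a disaster) lend themselves to being recorded as structured data. The sketch below is one possible representation under stated assumptions: the `ActivationCriteria` class, its field names and the example threshold and role names are hypothetical and are not taken from the template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActivationCriteria:
    """Hypothetical record of when and by whom a disaster may be declared."""
    service: str
    max_outage_minutes: int               # outage time frame that triggers the plan
    authorized_declarers: tuple[str, ...]  # roles allowed to declare a disaster


def may_declare_disaster(criteria, outage_minutes, requested_by):
    """Return True only if the outage has lasted long enough and the
    requester is one of the authorized declarers."""
    long_enough = outage_minutes >= criteria.max_outage_minutes
    authorized = requested_by in criteria.authorized_declarers
    return long_enough and authorized


# Example values are illustrative only.
wan_criteria = ActivationCriteria(
    service="WAN",
    max_outage_minutes=60,
    authorized_declarers=("network_manager", "cio"),
)
```

With these example values, `may_declare_disaster(wan_criteria, 90, "cio")` is `True`, while a 30-minute outage or a request from an unlisted role yields `False`. Keeping the criteria in data rather than prose makes it easier to review them during plan maintenance.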
Alternate network recovery capabilities
From a network DR perspective, an alternate network recovery facility can be an important component of a DR plan. Businesses typically use these to recover and restore IT infrastructure and operations when a primary data center or other network resource becomes unavailable.
Cloud vendors and managed service providers are available to provide backup network support and recovery. With these resources, it might not be necessary to have a separate internal recovery site. The third party might be able to accommodate most or all network infrastructure and connectivity issues.
An alternate network DR site can be implemented in a second data center owned and operated by the company -- or another organization the company can depend on -- to recover and resume operations in a network disruption. MSPs, cloud vendors and primary carriers all offer a variety of network DR services.
Organizations with aggressive recovery objectives and large data processing requirements might use this model via a cloud vendor or MSP to serve as the data center and network operations center.
Common mistakes of network DR planning
Redundancy and diversity are the fundamental components when planning a resilient and survivable communications network. With that in mind, businesses often make a few common mistakes when preparing a network DR plan.
Companies often don't take the time to examine the network infrastructure of their primary and alternate carriers outside their building when assessing the redundancy and resilience of voice and data networks. This oversight can result in a lack of organizational knowledge about the network infrastructure.
Where does service enter the building, and is there just a single entry point? Is service delivered using overhead wires or underground? If the former, where are the poles located, and are they exposed to passing vehicle traffic? If the latter, what type of conduit does the carrier use to carry service to the building? If a path is blocked, will voice and data service continue over another route? Knowing the answers to questions like these is critical to a network DR plan.
Internally, IT should design the network infrastructure with diversity and redundancy as much as possible. This is done to reduce the potential single points of failure that can take down the network. Network connectivity devices should be configured in a redundant arrangement, and an inventory of backup switches, routers and other network devices should be available. Consider the use of more than one ISP, or have an alternate ISP ready to go if service from the primary ISP fails. Ensure service providers can demonstrate their network diversity and redundancy, as well as their DR plans and how they help customers with network DR activities.
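One way to look for single points of failure is to model the network as a graph of devices and links, then check whether removing any one device disconnects the rest. The following is a minimal brute-force sketch, assuming a small topology that is fully connected to begin with; the function name and the device names in the example are hypothetical.

```python
from collections import defaultdict, deque


def single_points_of_failure(links):
    """Return devices whose removal disconnects the remaining devices.

    `links` is a list of (device_a, device_b) pairs. The topology is
    assumed to be connected before any device is removed.
    """
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    nodes = set(graph)

    spofs = []
    for removed in nodes:
        remaining = nodes - {removed}
        if len(remaining) < 2:
            continue
        # Breadth-first search over the topology, skipping the removed device.
        start = next(iter(remaining))
        seen = {start}
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbor in graph[current]:
                if neighbor != removed and neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        if seen != remaining:
            spofs.append(removed)
    return sorted(spofs)


# Illustrative topology: two ISPs and two core switches, but one edge router.
example_links = [
    ("isp_a", "edge_router"),
    ("isp_b", "edge_router"),
    ("edge_router", "core_switch_1"),
    ("edge_router", "core_switch_2"),
    ("core_switch_1", "access_switch"),
    ("core_switch_2", "access_switch"),
]
print(single_points_of_failure(example_links))  # ['edge_router']
```

In this example the dual ISPs and dual core switches provide diversity, but the lone edge router still takes down external connectivity if it fails. The sketch treats every device, including carrier endpoints, as a node and does not model physical cable paths or shared conduits, which still need to be checked with the carrier.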
How to test a network DR plan
In most cases, it is difficult to disable parts of a network to test its recoverability. This is especially true if the network segments carry production traffic, such as email and customer data. Since most organizations use services provided by carriers, the most appropriate starting point for organizing DR tests is to ask carriers how they can support network DR testing efforts.
Tests typically examine how quickly a specific network service, such as internet access, can be recovered and returned to normal. This can include shutting down network devices, such as switches, to see how quickly they can be brought back online. Taking down a more complex system, such as a VoIP switch, should be done out of hours or over a weekend. The test will determine how quickly the system software can be reloaded and the systems restarted and configured for normal operations.
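Recovery time during such a test can be measured by polling the service until it responds again. The sketch below is one possible approach under stated assumptions: the function names, the default polling interval and the use of a TCP connection attempt as the "service is up" check are illustrative choices, and a real test would use whatever health check suits the service.

```python
import socket
import time


def tcp_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def measure_recovery_seconds(is_up, poll_interval=5.0, max_wait=3600.0,
                             clock=time.monotonic, sleep=time.sleep):
    """Poll `is_up()` until it returns True and report the elapsed seconds.

    Returns None if the service has not recovered within `max_wait`.
    `clock` and `sleep` are injectable so the logic can be exercised
    without waiting in real time.
    """
    started = clock()
    while True:
        if is_up():
            return clock() - started
        if clock() - started >= max_wait:
            return None
        sleep(poll_interval)
```

A tester might start the timer when a switch is powered off and call `measure_recovery_seconds(lambda: tcp_reachable("192.0.2.10", 22))`, where the address is a documentation placeholder for the device under test. The measured value is only as precise as the polling interval.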
Following a completed test, compile the results into an after-action report and update the network DR plan based on intelligence gathered from tests.
How to audit and maintain a plan
Auditing and maintaining a network DR plan helps ensure it stays up to date and continues to meet the needs of the organization. It ensures the plan addresses network technology issues, people and processes, and that it has the proper controls to initiate when the organization is confronted with an actual emergency.
The results of an audit can detect areas of the plan that are incomplete, out of date or untested, or that lack proper procedures and suitable documentation. A network DR audit should address the following issues and controls:
- DR policies and mission statement.
- Continual updating of written DR plan.
- Data, systems and network recovery.
- Backups of network facilities and systems.
- Network DR policies.
- Network security procedures.
- Testing of network DR procedures.
- Backups of data and systems needed for network DR.
- Designated network DR committee and chairperson.
- Listing of emergency contact information.
- Details on insurance relating to loss of network service.
- Service-level agreements addressing networks.
- Effective communications procedures.
- Up-to-date and validated network or system operational documentation.
- Documented emergency procedures.
- Alternates of essential personnel.
- Software, hardware and networking vendor lists.
- In-place automated and manual procedures.
- External service-level agreements and contracts.
- Awareness and training activities.
- Continuous improvement activities.
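Audit results for the controls listed above can be tracked in a simple structure that highlights gaps. The following is a minimal sketch, assuming each control receives one status label; the status values (`ok`, `incomplete`, `out_of_date`, `untested`, `not_assessed`) and the function name are hypothetical, and only a few of the controls are shown.

```python
# A few controls from the audit list; extend with the rest as needed.
CONTROLS = [
    "Testing of network DR procedures",
    "Listing of emergency contact information",
    "Service-level agreements addressing networks",
    "Documented emergency procedures",
]


def audit_gaps(findings, controls=CONTROLS):
    """Return a mapping of control to status for every control that is not 'ok'.

    Controls missing from `findings` are reported as 'not_assessed'.
    """
    gaps = {}
    for control in controls:
        status = findings.get(control, "not_assessed")
        if status != "ok":
            gaps[control] = status
    return gaps


# Illustrative findings only.
example_findings = {
    "Testing of network DR procedures": "untested",
    "Listing of emergency contact information": "ok",
    "Documented emergency procedures": "out_of_date",
}
```

Calling `audit_gaps(example_findings)` reports the untested and out-of-date controls and also flags the service-level agreement control as `not_assessed`, since no finding was recorded for it.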
Paul Kirvan is an independent consultant, IT auditor, technical writer, editor and educator. He has more than 25 years of experience in business continuity, disaster recovery, security, enterprise risk management, telecom and IT auditing.