Best practices for a strong disaster recovery testing strategy

Testing is a critical part of the disaster recovery planning process. Without proper testing, IT teams might miss crucial updates or make avoidable missteps in a recovery.

By Stuart Burns

Published: 29 Jan 2024

Good disaster recovery testing comes from thorough planning and preparation. An untested plan is another crisis waiting to happen, so it is critical to have a disaster recovery testing strategy in place.

Full disaster recovery plan testing is not something many organizations can do frequently. Planning and executing a disaster recovery test requires two valuable resources: time and money. For that reason alone, DR teams must be realistic about how many tests they can execute each year. Most major applications are tested end to end once a year at most. Some applications might be tested only once every three years. The right frequency depends on the DR team's requirements.

This leaves disaster recovery teams with a dilemma: If they don't test often enough, critical applications or processes could miss out on necessary updates. However, if they spread themselves too thin with extraneous testing, they risk exhausting the time and money available to them. A testing strategy must be almost as thorough as the recovery itself. This helps ensure DR teams don't miss any required changes and can use even limited resources to the fullest.

To get the most out of a disaster recovery testing strategy, consider incorporating these best practices.

Determine the type of test and plan accordingly

Disaster recovery testing comes in two types: full DR tests and component tests. Component tests are smaller in scope and cover only a subset of the application. Most component tests are effectively smoke tests that help ensure the smaller parts of the overall application are working before the team commits significant resources to a full-blown DR test.

Before talking about the technical aspects of the test, it's critical to understand what is being tested. Is it a full interactive disaster recovery test with users being asked to log in, perform in a crisis scenario and prove that the application works as expected? Or is it enough to verify that the systems and software are available? Depending on the tools or processes in an organization's DR plan, it might be necessary to perform a full run-through of the plan to test how it will run in a crisis.

Ensure everything is in place early -- and double-check

It might seem trivial, but failing to check key components before running a full test is one of the most common and preventable mistakes organizations make. The point of a DR test is to ensure things work as expected, but when a problem can be found and fixed outside the full test, it's worth checking that everything is in place beforehand. This is one area where component testing can come in handy.

A frequent example is when an IT team discovers that required firewall ports are not open. This is something they might find during the full DR test, but it's still easier to check ahead of time to preserve time and resources. Remediating firewall issues can be a frustrating process, and it's likely not something security and networking staff want to deal with in the middle of running an end-to-end DR test.
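A pre-test check of this kind can be scripted. The following is a minimal sketch in Python, not a prescribed tool: it only attempts a TCP connection to each endpoint and reports the ones that fail. The function names `check_port` and `preflight` and the host names in the usage comment are hypothetical, and a successful connection shows only that the port is reachable from the machine running the script, not that the application behind it is healthy.

```python
from __future__ import annotations

import socket
from typing import Iterable


def check_port(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


def preflight(endpoints: Iterable[tuple[str, int]],
              timeout: float = 3.0) -> list[tuple[str, int]]:
    """Return the endpoints that could not be reached, in input order."""
    return [(host, port) for host, port in endpoints
            if not check_port(host, port, timeout)]


# Example usage (host names are placeholders, not real systems):
#
#   required = [("dr-db.example.internal", 5432),
#               ("dr-app.example.internal", 443)]
#   for host, port in preflight(required):
#       print(f"Blocked or unreachable: {host}:{port}")
```

Run from the network segment that will host the recovered systems, a check like this gives security and networking staff a list of blocked ports to fix before the end-to-end test begins.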

Good documentation is evergreen

Good documentation is paramount. If a DR test is done by less experienced staff, they might face and resolve several problems along the way. However, if they don't document those issues and the remediations, the loss of that information can significantly slow down the next DR test or a real recovery.

There are four types of documentation DR teams must have for a strong testing strategy:

The current DR plan as written, with discrete steps and a schedule.
Notes on any issues that came up during testing and how they were fixed. If there was a temporary workaround, outline what it was.
Detailed documentation of the testing process. This should include what is being tested and by whom.
Admin sign-off on test completion.
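One way to keep these four items together is a single structured record per test. The sketch below, in Python, is an illustration only: the class names `Issue` and `DRTestRecord`, their fields and the `missing` method are all hypothetical, and a real team would likely add a schedule, timestamps and links to the full plan.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Issue:
    """A problem found during testing and how it was handled."""
    description: str
    fix: str
    workaround: Optional[str] = None  # temporary workaround, if one was used


@dataclass
class DRTestRecord:
    plan_steps: List[str] = field(default_factory=list)  # 1. plan as written
    issues: List[Issue] = field(default_factory=list)    # 2. issues and fixes
    scope: str = ""                                      # 3. what is tested
    testers: List[str] = field(default_factory=list)     # 3. by whom
    signed_off_by: Optional[str] = None                  # 4. admin sign-off

    def missing(self) -> List[str]:
        """List the documentation items that are still incomplete."""
        gaps = []
        if not self.plan_steps:
            gaps.append("DR plan steps")
        if not self.scope or not self.testers:
            gaps.append("testing process (scope and testers)")
        if any(not issue.fix for issue in self.issues):
            gaps.append("fix for every recorded issue")
        if not self.signed_off_by:
            gaps.append("admin sign-off")
        return gaps
```

An empty issue list is treated as valid here, since a test can finish without problems. Calling `missing()` before closing out a test gives a quick list of what still needs to be written down.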

Don't bypass thorough wrap-up and reporting

It might seem simple, but post-test reporting is where many DR teams fall short. Unfortunately, this is also the task with the most visibility and impact at the management level.

Management is rarely interested in the nuts and bolts of IT, and relaying success or failure at a high level is a complex undertaking. This is especially true when a production system is taken down to test a DR scenario. Just as with a real disaster, IT teams should create comprehensive documentation throughout the process to inform management of how the test went and any areas they must address.

To avoid overloading management with technical details during wrap-up, timely communication of high-level status during the test is critical. Keep in mind that some DR tests can be quite lengthy in execution, spanning 24 hours or more. Ensuring those key stakeholders stay apprised of what is happening keeps them happy and shows good communication.

Stuart Burns is a virtualization expert at a Fortune 500 company. He specializes in VMware and system integration with additional expertise in disaster recovery and systems management. Burns received vExpert status in 2015.
