Backups without testing are potentially worthless. Harsh, I know, but consider the importance of the data, applications, systems and workloads those backups contain: if you never test them, you run the risk of recovery failure.
It's necessary to have a testing plan in place to ensure your backups will actually do what you expect should a restore scenario occur. In this article, I'll break down backup testing into three basic steps you can follow, providing options along the way to meet specific testing needs.
Step 1: Determine which backups require testing
You might be thinking every backup set requires testing, but that's not always feasible. Testing every single backup created would amount to performing an enterprise-wide recovery. Instead, prioritize the backups that are critical enough that you must be certain the data set can be recovered successfully.
Here are some ways you can approach determining which backups should be tested:
- Start with data, systems, applications and complex workloads. If these parts of your operations are critical enough to have a very small recovery time objective (RTO), they are important enough to have their backups tested.
- Continue on to operational functionality. No recovery is complete without actually connecting users to the restored workloads. You must extend your thinking about backup testing to include user endpoints and some verification that the workloads you've restored actually work.
- Consider dependencies. It's likely (nearly) none of your backed-up workloads work independently. They need other systems, files and directory services to function. While we're venturing a bit into disaster recovery (DR) territory here, it's important to be thinking about the potential need for any dependent workloads to be restored as part of your testing.
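The selection logic above can be sketched in a few lines of code. This is a hypothetical illustration, not part of any backup product: the `Workload` structure, the RTO threshold and the example workload names are all assumptions, but the idea of selecting tight-RTO workloads and then pulling in their dependencies mirrors the steps described.

```python
from dataclasses import dataclass, field

@dataclass
class Workload:
    name: str
    rto_minutes: int                      # recovery time objective
    depends_on: list = field(default_factory=list)

def backups_to_test(workloads, rto_threshold=60):
    """Select workloads whose RTO is tight enough to warrant backup
    testing, plus every workload they depend on."""
    by_name = {w.name: w for w in workloads}
    selected = {w.name: w for w in workloads if w.rto_minutes <= rto_threshold}
    # Pull in dependencies transitively: a restore is only complete
    # when the systems a workload relies on come back too.
    queue = list(selected)
    while queue:
        for dep in by_name[queue.pop()].depends_on:
            if dep not in selected and dep in by_name:
                selected[dep] = by_name[dep]
                queue.append(dep)
    return sorted(selected)

# Hypothetical environment: only "erp" has a tight RTO, but it drags
# its directory service and database into the testing scope.
workloads = [
    Workload("erp", 30, depends_on=["directory", "database"]),
    Workload("database", 240),
    Workload("directory", 480),
    Workload("wiki", 1440),
]
print(backups_to_test(workloads))  # ['database', 'directory', 'erp']
```

Even if you never automate selection, walking through this logic on paper forces you to write down the dependency map, which is half the value of the exercise.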
Step 2: Determine how you will test backups
In general, there are several kinds of backup testing that organizations use to validate their backups. Depending on your staffing, expertise and comfort level, any of them is better than doing no testing at all. The goal is simply to validate the integrity of the backup created, which is accomplished using products that offer the ability to verify backups. Possible testing aspects include:
- Testing to boot up. In cases where backups of entire virtual machines are created, many backup products support the ability to recover a system in question and see that it boots up all the way to the Windows logon screen. You can also perform this task manually.
- Testing system functionality. This test takes the previous method a few steps further. Check whether services start; whether the system responds to pings or application-specific interaction (e.g., connecting to TCP port 25 on your Exchange server to get a response); or even whether user interaction produces the expected system response. Some backup products incorporate orchestration capabilities to automate these kinds of advanced tests.
- Testing recovery. While this has its roots in DR, there are proven methods of testing recovery plans that can be applied to backup testing. These include tabletop exercises of the restore process, performing scenario-based restore simulations (meaning, you perform the necessary restores based on a given situation and ensure everything restores properly), and a full restore simulation of the entire environment.
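The port-response check mentioned above can be scripted with nothing but the standard library. Here is a minimal sketch; the hostname `restored-exchange.example.com` is a placeholder, and a TCP connect only proves a listener is up, not that the application behind it is healthy.

```python
import socket

def port_responds(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds -- a
    quick functional check that a restored service is listening."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: verify a restored mail server accepts SMTP connections.
# The hostname below is a placeholder for your recovered system.
if port_responds("restored-exchange.example.com", 25):
    print("SMTP port open: restored server is accepting connections")
else:
    print("SMTP port closed: escalate to a fuller recovery test")
```

A check like this slots naturally into the orchestration stage: run it automatically after each test restore and alert only when it fails.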
Step 3: Determine backup testing frequency
Once you've bought into the need to test backups, this step is either really easy or moderately difficult -- all depending on whether you can automate testing with your backup product. Putting that aside for a moment, let's discuss how to answer the question, "How often should I test?"
The simple method is to look at the criticality of the system, application or workload and set a testing schedule based on how important a successful restore is.
Here are my recommendations regarding the frequency of backup testing:
- Static workloads. If a given combination of systems, applications and services does not change frequently, with the exception of the data it utilizes, the backups aren't changing either. This means an older backup can likely do the job of restoring this workload should your last backup set not be good. If you have no automated way to test backups and the work will be done manually, perform a backup test at least once annually. If you can use automation, consider more frequent testing.
- Mission-critical workloads. Sometimes even the workloads you simply can't function without fall into the "static" category, in that they largely don't change much, except for their data. Even so, because of their critical nature, these backups should be tested much more frequently. With no automation, quarterly testing provides adequate coverage. With automation, more frequent backup testing makes sense.
- Data. This is a tough one, as the only way to conclusively test the viability of backed-up data is to put it into use. This is why I've spilled ever-so-slightly into the realm of DR. In cases where you want to test the backups of data utilized by your applications, it may be necessary to perform a simulation-type recovery to see that the backup is good. Testing of data-only backups should align with the workloads they service, keeping in mind it may require more work to determine backup viability.
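The frequency recommendations above can be captured as a simple schedule lookup. The interval values below are one reasonable starting point drawn from the guidance in this section (annual for manually tested static workloads, quarterly for manually tested mission-critical ones), not a standard; adjust them to your environment.

```python
from datetime import date, timedelta

# Suggested testing intervals in days, keyed by (criticality, mode).
# "manual" vs. "automated" mirrors whether your backup product can
# run tests for you. These numbers are assumptions, not a standard.
TEST_INTERVALS = {
    ("static", "manual"): 365,            # at least annually
    ("static", "automated"): 90,
    ("mission-critical", "manual"): 90,   # quarterly
    ("mission-critical", "automated"): 30,
}

def next_test_date(last_test, criticality, mode):
    """Return the date the next backup test is due."""
    return last_test + timedelta(days=TEST_INTERVALS[(criticality, mode)])

print(next_test_date(date(2024, 1, 1), "mission-critical", "manual"))
# prints 2024-03-31 (90 days after the last test)
```

Feeding a table like this from your configuration management database makes it easy to report which workloads are overdue for a test.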