The pros and cons of cloud disaster recovery
Cloud DR is a great option, but it's not as easy as it might look. For starters, you'll need to carefully prepare data and have servers prebuilt as VMs in the cloud.
For some people, disaster recovery -- and especially cloud disaster recovery options -- means you back up data to the cloud. There are tools and procedures that enable an organization to do this quite effectively. The catch is that this is not DR. It's backup.
Having a copy of your data off site is a good idea, but it's a brick of data, and that alone can't be converted into a functioning data center. You can't simply attach a few virtual machines to that data and get a data center up and running.
Backups are fundamental to a disaster recovery strategy. You need them to bring your environment back online. But it's the framework around that data that makes it a full disaster recovery plan. This can encompass additional servers, storage, networking, personnel and a host of services designed to meet recovery time objective (RTO) and recovery point objective (RPO) goals, whether those are several hours or just a few minutes. In the past, this was sometimes done with remote data centers that were set up and configured to resemble the main site, perhaps on a smaller scale. Because these locations are expensive to create and maintain, few companies had them.
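To make the RTO and RPO goals concrete, here is a minimal sketch of how a plan could be checked against them. The `RecoveryPlan` class, its field names and the sample numbers are illustrative assumptions, not part of any provider's tooling.

```python
from dataclasses import dataclass


@dataclass
class RecoveryPlan:
    # Hypothetical fields for illustration only.
    backup_interval_min: float  # time between successive recovery points
    restore_time_min: float     # estimated time to bring the service back online


def meets_objectives(plan, rpo_min, rto_min):
    """Return (rpo_ok, rto_ok) for a plan against its objectives, in minutes."""
    # Worst-case data loss is the gap between two recovery points.
    rpo_ok = plan.backup_interval_min <= rpo_min
    # Worst-case downtime is the full estimated restore time.
    rto_ok = plan.restore_time_min <= rto_min
    return rpo_ok, rto_ok


# Assumed example: a nightly backup that takes eight hours to restore.
nightly = RecoveryPlan(backup_interval_min=24 * 60, restore_time_min=8 * 60)
print(meets_objectives(nightly, rpo_min=15, rto_min=4 * 60))
```

A nightly backup alone fails both checks here, which is the gap between having a backup and having a DR plan.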
The arrival of cloud computing has opened new and seemingly ideal possibilities for DR in the cloud. Cloud disaster recovery provides a way to retrieve data from a location that is unaffected by whatever incident caused you to lose your on-premises environment.
Still, there are things to know about how the cloud can be of use for DR purposes. These factors can be placed into two categories: data and infrastructure.
The data piece to cloud DR
There are multiple ways to duplicate data from your on-premises location to a cloud. Most clouds today have storage gateways to help facilitate this. If you have enough bandwidth, you can keep the cloud copy nearly synchronous with the source.
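Whether you have "enough bandwidth" comes down to simple arithmetic. The sketch below is a rough estimate only: the function names, the 70% usable-link assumption and the sample figures are hypothetical, and real throughput depends on latency, protocol overhead and competing traffic.

```python
def link_capacity_gb_per_hour(link_mbps, usable_fraction=0.7):
    """Convert a link speed in Mbit/s to usable GB per hour (decimal units)."""
    # Mbit/s -> MB/s (divide by 8) -> MB/hour (times 3600) -> GB/hour (divide by 1000)
    return link_mbps * usable_fraction / 8 * 3600 / 1000


def can_keep_up(change_rate_gb_per_hour, link_mbps, usable_fraction=0.7):
    """True if the link can carry data at least as fast as it changes."""
    return link_capacity_gb_per_hour(link_mbps, usable_fraction) >= change_rate_gb_per_hour


def initial_seed_hours(dataset_gb, link_mbps, usable_fraction=0.7):
    """Hours needed for the first full copy of the data set."""
    return dataset_gb / link_capacity_gb_per_hour(link_mbps, usable_fraction)


# Assumed example: 1 Gbit/s link, 50 TB data set, 100 GB of change per hour.
print(round(link_capacity_gb_per_hour(1000), 1))
print(can_keep_up(100, 1000))
print(round(initial_seed_hours(50_000, 1000), 1))
```

Under these assumptions the link keeps up with ongoing changes, but the initial copy still takes days, which is worth knowing before you commit to a cutover date.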
You may incur charges for transferring data into a cloud disaster recovery environment. Or you may not. This will depend on the cloud provider you choose and whether it involves object storage or file storage.
For the most part, providers will charge little or nothing to move data in. The same cannot be said for accessing or storing it. Depending on how frequently you need the data or how much space you require, this cost can grow quickly. And, remember, you'll pay this expense each and every month, and your fee will most likely never go down.
You have the option to go with lower-tier storage, but this can increase the time for retrieval from minutes to hours. It's worth stopping to ask if that sort of delay is really what you'd be happy with in an actual disaster recovery situation.
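The tradeoff between monthly cost and retrieval time can be put into numbers. This is a sketch under stated assumptions: the tier names, prices and retrieval times in `TIERS` are placeholders invented for illustration, not quotes from any provider.

```python
# Placeholder figures for illustration only; check your provider's price list.
TIERS = {
    "hot": {"usd_per_gb_month": 0.02, "retrieval_hours": 0.0},
    "cold": {"usd_per_gb_month": 0.005, "retrieval_hours": 5.0},
}


def cumulative_storage_cost(size_gb, tier, months):
    """Total storage spend over a period, ignoring access and egress fees."""
    return size_gb * TIERS[tier]["usd_per_gb_month"] * months


def tier_fits_rto(tier, rto_hours, rebuild_hours):
    """True if retrieval plus the rest of the recovery fits inside the RTO."""
    return TIERS[tier]["retrieval_hours"] + rebuild_hours <= rto_hours


# Assumed example: 10 TB kept for three years, four-hour RTO, one-hour rebuild.
for tier in TIERS:
    cost = cumulative_storage_cost(10_000, tier, months=36)
    print(tier, round(cost), tier_fits_rto(tier, rto_hours=4, rebuild_hours=1))
```

With these made-up numbers, the cold tier costs a quarter as much but cannot meet a four-hour RTO.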
The infrastructure piece to cloud DR
While storage of your data is critical, it's somewhat straightforward. The necessary infrastructure, however, is an entirely different situation.
You can't simply replicate your VMs into an Amazon S3 bucket and expect to switch them on when you need them. That's not how the cloud works. You need to have those servers prebuilt as VMs in the cloud. That way you can connect them to your storage when needed and complete your disaster recovery.
It's worth noting that even if those VMs are not powered up, you still pay for them, typically for their disks and any capacity reserved for them. If standby VMs were free, everyone would use them -- and the cloud providers would be out of business.
Having a collection of VMs waiting to be used in the cloud can cost almost as much as a new data center if you keep them long enough. Cloud services aren't free, and they aren't priced around what's ideal for you. Cloud providers are in business to make money.
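"If you keep them long enough" is a break-even question. The following sketch estimates the month at which cumulative standby spend catches up with a second data center. The function name and all dollar figures are hypothetical, and the model ignores price changes, discounts and hardware refreshes.

```python
import math


def breakeven_month(cloud_monthly_usd, dc_upfront_usd, dc_monthly_usd):
    """First month in which cumulative cloud spend reaches the data center's
    cumulative cost, or None if the cloud never catches up."""
    monthly_gap = cloud_monthly_usd - dc_monthly_usd
    if monthly_gap <= 0:
        return None
    # Cloud spend closes the upfront gap by monthly_gap every month.
    return math.ceil(dc_upfront_usd / monthly_gap)


# Assumed example: $20,000 a month for standby VMs versus a site that costs
# $500,000 to build and $8,000 a month to run.
print(breakeven_month(20_000, 500_000, 8_000))
```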
The newer option of infrastructure as code (IaC) might be of interest. This is where you deploy the necessary infrastructure to support your data center when it's needed rather than keeping it active or on standby.
This can be a game-changer in the effort to control costs when looking at cloud disaster recovery, but it is not a perfect solution to the problem. To deploy infrastructure on demand from code, what you have on site must be in sync with the code you would use in the cloud. So, as the on-premises environments change, you will need to update your IaC to reflect that. That's not impossible to pull off, but it will require high levels of discipline and strong coding knowledge.
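One way to enforce that discipline is a routine drift check between the on-premises inventory and the IaC definitions. The sketch below assumes both can be exported as plain dictionaries keyed by server name; the `find_drift` function and the sample servers are hypothetical, and real IaC tools have their own ways to report drift.

```python
def find_drift(on_prem, iac):
    """Compare two {server_name: {setting: value}} inventories."""
    on_prem_names, iac_names = set(on_prem), set(iac)
    # Servers that exist on site but have no IaC definition yet.
    missing_from_iac = sorted(on_prem_names - iac_names)
    # Definitions left behind for servers that no longer exist on site.
    stale_in_iac = sorted(iac_names - on_prem_names)
    # Servers defined in both places whose settings differ.
    changed = {}
    for name in sorted(on_prem_names & iac_names):
        diffs = {
            key: (value, iac[name].get(key))
            for key, value in on_prem[name].items()
            if iac[name].get(key) != value
        }
        if diffs:
            changed[name] = diffs
    return missing_from_iac, stale_in_iac, changed


# Assumed sample inventories.
on_prem = {
    "web-01": {"cpus": 4, "memory_gb": 16},
    "db-01": {"cpus": 16, "memory_gb": 128},
    "cache-01": {"cpus": 2, "memory_gb": 8},
}
iac = {
    "web-01": {"cpus": 4, "memory_gb": 16},
    "db-01": {"cpus": 8, "memory_gb": 64},
    "batch-01": {"cpus": 4, "memory_gb": 8},
}
missing, stale, changed = find_drift(on_prem, iac)
print(missing, stale, changed)
```

Run on a schedule, a check like this turns "keep the code in sync" from a good intention into something that fails loudly.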
The length of time it would take to deploy your IaC environment is another complication. Spinning up dozens or hundreds of servers at a moment's notice with code doesn't happen quickly. The process could take hours depending on how much automation you have in place and how many last-minute changes you will need to make. One option is a combination: warm servers that you pay for each month, with IaC deployed on demand to complete the environment.
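A back-of-the-envelope estimate shows why an on-demand deployment can run to hours. This sketch assumes servers are provisioned in equal-sized parallel batches; the function name, the per-server time and the parallelism limit are illustrative guesses, and real deployments have dependencies that this ignores.

```python
import math


def estimated_deploy_minutes(server_count, minutes_per_server, parallelism,
                             manual_fix_minutes=0):
    """Rough deployment time when servers are built in parallel batches."""
    batches = math.ceil(server_count / parallelism)
    # Each batch takes as long as one server; manual fixes are added on top.
    return batches * minutes_per_server + manual_fix_minutes


# Assumed example: 200 servers, 12 minutes each, 20 at a time,
# plus an hour of last-minute changes.
print(estimated_deploy_minutes(200, 12, 20, manual_fix_minutes=60))
```

Under these assumptions the environment takes three hours to appear, before any application-level checks begin.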
The problem is that this best-of-both-worlds approach requires resources to keep everything in sync and ready. It can be more attainable if you scale back some of what you're going to tag for disaster recovery. After all, not everything needs to be a top priority in the context of DR.
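Scaling back starts with sorting workloads by how quickly each one has to return. The sketch below is one possible way to split them, assuming each workload has an agreed RTO in hours (or `None` if it is left out of DR). The `plan_tiers` function and the workload names are hypothetical.

```python
def plan_tiers(workloads, iac_deploy_hours):
    """Split (name, rto_hours) pairs into warm, IaC and out-of-scope groups."""
    plan = {"warm": [], "iac": [], "out_of_scope": []}
    for name, rto_hours in workloads:
        if rto_hours is None:
            # No recovery target agreed: not tagged for DR at all.
            plan["out_of_scope"].append(name)
        elif rto_hours < iac_deploy_hours:
            # Must be back sooner than IaC can deploy: keep it warm.
            plan["warm"].append(name)
        else:
            # Can wait for an on-demand deployment.
            plan["iac"].append(name)
    return plan


# Assumed example workloads and a three-hour IaC deployment estimate.
workloads = [
    ("order-db", 1),
    ("web-frontend", 2),
    ("reporting", 24),
    ("dev-sandbox", None),
]
print(plan_tiers(workloads, iac_deploy_hours=3))
```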
Words of caution
Disaster recovery in the cloud is possible, but it will require some serious planning. And management needs to know that it will probably be costly. Sure, it might not be as expensive as operating another data center, but it might not be far off once you start adding up those monthly bills over multiple years. The keys to cloud disaster recovery will be automation and coding -- unless you have the money to simply replicate what you have on premises into the cloud.
Another key point to remember is that if you are recovering from an incident or disaster that is specific to your business, cloud-based disaster recovery should be perfectly effective. But if your problem is caused by a natural event or large-scale internet outage, you might be competing with your neighbors locally and others far away for cloud resources. In those situations, you won't be the only one relying on cloud-based disaster recovery. And while your disaster recovery tests might work well under normal conditions, things will be different during a widespread outage. In that situation, you won't have much control.