Cloud is a great technology, but it can fail due to the sheer number of moving parts. When your entire infrastructure is in the hands of one vendor rather than multiple cloud providers, you'll be stuck when a major failure hits.
Some cloud providers offer multiple availability zones, so the system can be failed over in a disaster, but the provider decides when to implement that scenario. The answer to this dilemma is to have capacity and failover ready to go with another cloud provider. Before that fateful day arrives -- and it will -- here are some of the factors that you should take into consideration.
While it might seem unlikely that you would be unable to fail over within one provider, all major cloud providers have had multiple major failures. Rather than getting caught off guard by an unexpected crisis, it helps to be prepared for the worst-case scenario.
Failing over and out
One thing a lot of newcomers to cloud DR fail to consider is that not all cloud products are the same. Most companies use a combination of platform as a service (PaaS) and infrastructure as a service (IaaS). Contrary to popular belief, PaaS offerings are more difficult to fail over than IaaS, because they are replicated differently.
One of the major issues with the larger cloud providers is that they seem to have little to no interest in allowing users to fail out of their cloud. When you can fail out, after all, you can easily migrate and move to an alternative cloud provider.
Large cloud providers have the capability to fail over within a region to another data center, using their own technology, which comes at an additional cost, usually per virtual machine (VM). While this is good for protecting users from physical failure, it can't always prevent logical failures that affect multiple physical locations or the entire control plane.
It is possible to perform DR between multiple cloud providers and environments, but there needs to be a method of "translation" from one cloud vendor to another. This is where companies such as CloudEndure, Veeam and Zerto come into play.
Don't get lost in translation
While the replication side of using multiple cloud providers for DR is quite straightforward, you may encounter formatting issues if you're just moving the bits between sites. The replicated machines have to be converted into a usable, bootable format with required files injected during the reconfiguration. It's doable, but performing this one machine at a time would have a negative effect on the recovery time objective.
Making multiple conversions at a time will bring down the length of time for implementation but takes a powerful server, and the more powerful the server, the more it costs. This may not be an issue for some companies, as a larger budget or business continuity insurance will pay for it. If the costs are not an issue, it comes down to the quicker, the better.
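To make the tradeoff between conversion parallelism and recovery time concrete, here is a minimal back-of-the-envelope sketch. It assumes every conversion takes about the same time and that conversions run in fixed-size batches; the function name, VM count, and timings are hypothetical, and real conversion times will vary by machine size and tooling.

```python
import math


def estimated_conversion_minutes(vm_count, minutes_per_vm, parallel_slots):
    """Rough wall-clock time to convert all VMs.

    Assumes (for illustration only) that each conversion takes the same
    time and that the conversion server runs `parallel_slots` at once.
    """
    if parallel_slots < 1:
        raise ValueError("parallel_slots must be at least 1")
    batches = math.ceil(vm_count / parallel_slots)
    return batches * minutes_per_vm


# Hypothetical fleet: 40 VMs, 15 minutes per conversion.
one_at_a_time = estimated_conversion_minutes(40, 15, parallel_slots=1)
eight_at_once = estimated_conversion_minutes(40, 15, parallel_slots=8)
```

Under these assumptions, converting one machine at a time takes 600 minutes, while eight concurrent conversions take 75. Comparing figures like these against the recovery time objective shows how large a conversion server is worth paying for.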
Some vendors also place constraints on the volume of data that can be allocated to a single VM; therefore, the DR providers have rewritten their products to optimize throughput.
However, keep in mind that, when working with multiple cloud providers, whatever gets failed over needs to get failed back at some point. In a DR scenario, would the company want to fail over their entire fleet of VMs or just a select few to ride out the issues at the primary provider? On top of that, the data ingress and egress charges could be significant, depending on the configurations involved.
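A rough estimate of the transfer charges for a failover followed by a failback might look like the sketch below. The per-gigabyte rates and data volumes are made-up placeholders, not any provider's actual pricing, and the sketch assumes only egress is billed; check each provider's current price list before relying on numbers like these.

```python
def transfer_cost(data_gb, egress_rate_per_gb):
    """Cost of moving `data_gb` out of a cloud at a flat per-GB egress rate."""
    return round(data_gb * egress_rate_per_gb, 2)


def round_trip_cost(failover_gb, failback_gb,
                    primary_egress_rate, secondary_egress_rate):
    """Failover leaves the primary cloud; failback leaves the secondary.

    Assumes flat rates and free ingress (illustrative simplifications).
    """
    return round(
        transfer_cost(failover_gb, primary_egress_rate)
        + transfer_cost(failback_gb, secondary_egress_rate),
        2,
    )


# Hypothetical: fail over 2,000 GB, then fail back 2,500 GB after the
# data has grown during the outage. Rates are placeholders.
total = round_trip_cost(2000, 2500,
                        primary_egress_rate=0.09,
                        secondary_egress_rate=0.08)
```

Running the same calculation for the entire fleet and for a select few VMs gives a quick way to compare the two options raised above.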
While cloud-to-cloud failover may be appropriate for some users, it really depends on a number of factors, including the cost of downtime and the upfront configuration that needs to go into the design. Irrespective of whether or not the infrastructure is used, there can be significant ongoing costs for support and resource consumption.
My advice for those looking to do cloud-to-cloud recovery is, first, to identify the key infrastructure that needs to be failed over to maintain an acceptable level of service while keeping costs reasonable. Secondly, don't try to do these failovers manually. Where possible, use APIs to do the heavy lifting. In times of stress, such as DR failover, having it done without error by a computer will potentially save a lot of heartache later on. Make sure that the process is well documented and that everyone who needs to know how it works does. Lastly, ensure that you have the appropriate access to both clouds to manage the infrastructure.
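As an illustration of automating the failover instead of doing it by hand, here is a minimal sketch of a scripted runbook. The step names and placeholder functions are hypothetical; in practice each one would wrap a call to the relevant provider's or DR vendor's API, which is not shown here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    name: str
    action: Callable[[], None]


def run_failover(steps: List[Step]) -> List[str]:
    """Run steps in order, stopping at the first failure.

    Returns a log of what happened, so the run documents itself.
    """
    log = []
    for step in steps:
        try:
            step.action()
        except Exception as exc:
            log.append(f"FAILED {step.name}: {exc}")
            break
        log.append(f"OK {step.name}")
    return log


# Placeholder actions (hypothetical). Real versions would call cloud APIs.
def verify_secondary_access() -> None:
    pass


def start_key_vms() -> None:
    pass


def redirect_traffic() -> None:
    pass


PLAN = [
    Step("verify access to secondary cloud", verify_secondary_access),
    Step("start key VMs", start_key_vms),
    Step("redirect traffic", redirect_traffic),
]

if __name__ == "__main__":
    for line in run_failover(PLAN):
        print(line)
```

Keeping the plan as an ordered list means the script doubles as documentation of the procedure, and stopping at the first failed step avoids redirecting traffic to machines that never started.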