Building redundancy into your cloud outage strategy

Any effective cloud computing strategy should include measures to protect against outages. The best defense you can have is to plan for redundancy.

Cloud services aren't perfect, and even cloud leaders such as Amazon Web Services experience outages. The good news is that with cloud, you can protect yourself from cloud service provider outages in ways that weren't possible with traditional server hosting. The most important word to remember in your cloud computing strategy for outages is "redundancy."

In addition to regular backups, the best defense against cloud outages comes from planned redundancy. The idea is simple: If one server goes down, another server takes over, and end users don't even notice a problem.

There are several techniques for implementing redundancy. One option is to place your servers in multiple data centers. Amazon Web Services (AWS), for example, lets you choose where to host your servers, so you could put one server in its Virginia data center and a redundant server in its Oregon data center. If your cloud provider doesn't offer multiple data centers, you could distribute redundancy among vendors -- run some servers on AWS and others on Rackspace, for example.
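One way to check that a deployment follows this advice is to audit where each service's servers actually live. The Python below is a minimal sketch under stated assumptions: the inventory format, the service names and the function name `single_location_services` are hypothetical and are not part of any provider's API.

```python
from collections import defaultdict

# Hypothetical inventory: one (service, provider, data center) entry per server.
SERVERS = [
    ("web", "aws", "virginia"),
    ("web", "aws", "oregon"),
    ("db", "aws", "virginia"),
    ("db", "aws", "virginia"),
]


def single_location_services(servers):
    """Return services whose servers all sit in one (provider, data center)."""
    locations = defaultdict(set)
    for service, provider, data_center in servers:
        locations[service].add((provider, data_center))
    # A service with fewer than two distinct locations has no geographic redundancy.
    return sorted(s for s, locs in locations.items() if len(locs) < 2)


if __name__ == "__main__":
    print(single_location_services(SERVERS))  # prints ['db']
```

Here the `db` service has two servers, but both are in the same data center, so a single outage there would take both down.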

Next, have a management system and proper infrastructure in place so redundancy actually works in the event of an outage.

Cloud management software, such as VMware's vCloud Director, Microsoft's System Center and Cloud Lifecycle Management from BMC, can constantly monitor servers. If one server goes down, take it out of the set of active servers until you can bring it back up. For that to work, you need to configure your domain name system (DNS) servers so that when a client -- such as a browser -- looks up the IP address for the hostname in a URL, the address returned points to one of the servers that are up. This routes the client to an active server and skips servers that are down.
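The monitoring-and-routing idea can be sketched in a few lines of Python. This is an illustration only, not how any of the products named above work internally: the function names `is_up`, `active_servers` and `resolve` are hypothetical, the health check is assumed to be a plain TCP connection attempt, and the IP addresses come from a range reserved for documentation.

```python
import socket


def is_up(host, port=80, timeout=2.0):
    """Treat a server as up if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def active_servers(servers, check=is_up):
    """Return only the servers that pass the health check, preserving order."""
    return [s for s in servers if check(s)]


def resolve(servers, request_number, check=is_up):
    """Pick an address for a lookup, rotating across the servers that are up."""
    active = active_servers(servers, check)
    if not active:
        raise RuntimeError("no active servers")
    return active[request_number % len(active)]


if __name__ == "__main__":
    pool = ["203.0.113.10", "203.0.113.20", "203.0.113.30"]
    down = {"203.0.113.20"}  # simulate one failed server instead of using the network

    def fake_check(ip):
        return ip not in down

    # The failed server never appears in the answers.
    print([resolve(pool, n, fake_check) for n in range(4)])
```

Passing the health check in as a parameter keeps the routing logic testable without real servers; in production the check would more likely be an HTTP request against an application-level status page.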

Large companies likely have this kind of DNS configuration in place already, but smaller operations that aren't using sophisticated management software still have ways to implement redundancy. Keep a second server ready to go, but turned off. Some cloud vendors don't charge for servers that are powered off. In the event of an outage, fire up the redundant server, go directly to your DNS manager and switch the IP address to point to the new server. The effect may not be immediate, but with today's DNS servers, it should work quickly. DNS providers often warn that changes can take up to 24 hours to propagate, but in practice, it usually takes about 15 minutes. Note: If you maintain an additional server that's powered off, make sure you periodically update it with the latest version of its software.
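The manual switch described above amounts to a small decision plus a waiting period. The sketch below models both in Python; it does not call any real DNS provider. The `ARecord` class and both function names are hypothetical, and the delay estimate assumes that resolvers cache an address for no longer than the record's time-to-live (TTL), which is a simplification.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ARecord:
    name: str
    ip: str
    ttl_seconds: int


def fail_over(record, primary_up, standby_ip):
    """Return the record to publish: unchanged if the primary is up,
    otherwise a copy that points at the standby server."""
    if primary_up:
        return record
    return replace(record, ip=standby_ip)


def worst_case_stale_minutes(record):
    """Upper bound on how long a resolver that honors the TTL
    may keep handing out the old address."""
    return record.ttl_seconds / 60


if __name__ == "__main__":
    www = ARecord("www.example.com", "203.0.113.10", ttl_seconds=900)
    print(fail_over(www, primary_up=False, standby_ip="203.0.113.20"))
    print(worst_case_stale_minutes(www))  # prints 15.0
```

Under that assumption, a shorter TTL set before any outage occurs shortens the window in which clients are still sent to the failed server.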

Taking the right measures and planning ahead will help enterprises better prepare for a cloud outage. If an outage does occur, end users will experience little downtime or possibly none at all.

About the author
Jeff Cogswell is a software engineer with more than 20 years of experience in different technologies and platforms, ranging from Unix and Linux, Windows, ASP.NET/C# Web development, PHP development and various database technologies. He has authored several books, including C++ All-in-One for Dummies, and runs trainings for Web developers and programmers. Visit his website at www.cogsmedia.com. 
