6 steps to survive a cloud outage


Step 4: Put your DR plan into action

DR plans for mission-critical workloads are often fully automated, so administrators might not need to take any deliberate action. For example, a cluster that spans AWS availability zones or Azure regions can continue to function when one node becomes unavailable during a cloud outage.

However, less critical workloads might require deliberate action. When the business decides to initiate a DR plan that requires manual intervention, admins must act immediately. Turn to prepared scripts, templates or other resources to orchestrate the appropriate DR response. This might include restarting a workload from a snapshot or redirecting traffic to a standby instance for the duration of the cloud outage.
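
As an illustration of what a prepared failover script might decide, here is a minimal Python sketch. It is not tied to any cloud provider: the function name `choose_target`, the `probe` health-check callback and the host names are all hypothetical, and a real script would replace the probe with whatever check and traffic-redirection mechanism the DR plan specifies.

```python
from typing import Callable


def choose_target(
    probe: Callable[[str], bool],
    primary: str,
    standby: str,
    attempts: int = 3,
) -> str:
    """Return the endpoint that should receive traffic.

    `probe` is a hypothetical, caller-supplied health check that returns
    True when the given endpoint is reachable.
    """
    # Retry the primary a few times so one failed probe does not trigger a failover.
    for _ in range(attempts):
        if probe(primary):
            return primary
    # The primary failed every probe: fail over only if the standby is healthy.
    if probe(standby):
        return standby
    raise RuntimeError("neither primary nor standby passed the health check")


if __name__ == "__main__":
    # Simulated outage (assumed host names): only the standby answers its check.
    healthy = {"standby.example.internal"}
    target = choose_target(
        lambda host: host in healthy,
        "primary.example.internal",
        "standby.example.internal",
    )
    print(f"route traffic to: {target}")
```

Keeping the decision separate from the health check makes the logic easy to exercise in a drill with a simulated outage before it is ever needed in a real one.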

DR plans require periodic testing. Run regular drills to confirm that the procedures and resources needed to recover workloads are in place. Testing also verifies and validates the configuration of associated resources, such as IP addresses and related drivers and dependencies. If the recovery functions properly during regular tests, it's likely to function properly in actual DR situations.
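
One part of such a drill, checking that recovered resources came back with the expected IP addresses, could be sketched in Python as follows. This is an assumption-laden example rather than a prescribed procedure: the function `check_recovery_config` and the resource names are hypothetical, and the observed addresses would in practice come from the environment being tested.

```python
import ipaddress


def check_recovery_config(expected: dict[str, str], actual: dict[str, str]) -> list[str]:
    """Compare expected and observed IP addresses; return a list of problems.

    Both arguments map a hypothetical resource name to an IP address string.
    An empty list means every expected resource matched.
    """
    problems = []
    for name, want in expected.items():
        got = actual.get(name)
        if got is None:
            problems.append(f"{name}: missing after recovery")
            continue
        try:
            # Parse both sides so equivalent spellings of an address compare equal.
            same = ipaddress.ip_address(got) == ipaddress.ip_address(want)
        except ValueError:
            problems.append(f"{name}: invalid address in {want!r} or {got!r}")
            continue
        if not same:
            problems.append(f"{name}: expected {want}, got {got}")
    return problems


if __name__ == "__main__":
    # Illustrative values only: the web server came back on the wrong address.
    expected = {"db": "10.0.0.5", "web": "10.0.0.6"}
    observed = {"db": "10.0.0.5", "web": "10.0.0.9"}
    for problem in check_recovery_config(expected, observed):
        print(problem)
```

A drill that ends with an empty problem list gives some evidence that the recovery procedure restores the configuration the workload depends on.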

