Top 5 IT disaster scenarios DR teams must test

While most organizations are prepared to face small-scale interruptions, they cannot overlook a larger, more complex crisis just because it seems less likely to occur.

By Stuart Burns

Published: 08 Nov 2023

Typical interruptions IT teams prepare for are common events such as disk failures or power outages. However, there are several more IT disaster scenarios that businesses must address to be fully protected.

The root of many IT disasters is that the people responsible for recovery did not consider anything beyond hardware failure or the accidental or malicious loss of data. Unfortunately, threats and scenarios are always evolving, so disaster recovery plans must do the same.

There are many forms of disaster that can affect the availability of IT services, and some might be more relevant to individual organizations than others. It is prudent to assess which risks are most likely to threaten a company's infrastructure and services. A risk assessment matrix is one tool that can help determine the likelihood of a disaster occurring as well as its severity.
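As a rough illustration of how a risk assessment matrix turns likelihood and severity into a ranking, the following is a minimal Python sketch. The 1-to-5 scales, the color thresholds and the example scores are all assumptions made for illustration; they are not values from this article, and each organization would need to substitute its own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    likelihood: int  # assumed scale: 1 = rare, 5 = almost certain
    severity: int    # assumed scale: 1 = negligible, 5 = catastrophic

    @property
    def score(self) -> int:
        # A common convention is to multiply likelihood by severity.
        return self.likelihood * self.severity


def rating(score: int) -> str:
    """Map a score to a color band. Thresholds are illustrative only."""
    if score >= 15:
        return "red"
    if score >= 8:
        return "amber"
    return "green"


def rank(risks):
    """Return risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Hypothetical scores for the five scenarios covered in this article.
risks = [
    Risk("Failed backups", likelihood=3, severity=5),
    Risk("Natural disaster", likelihood=1, severity=5),
    Risk("Ransomware attack", likelihood=4, severity=5),
    Risk("Network interruption", likelihood=4, severity=2),
    Risk("Hardware failure", likelihood=3, severity=3),
]

for risk in rank(risks):
    print(f"{risk.name:22} {risk.score:2d}  {rating(risk.score)}")
```

The ranked output gives a DR team a starting order for testing, although the scores only mean something if the people who know the business agree on them.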

Below are five possible IT disaster scenarios that DR teams must prepare for and tips on how they can do that, regardless of business size and type, location, and infrastructure.

Failed backups

Failed backups are among the most frequent IT disasters. Businesses can replace hardware and software, but if the data and all of its backups are gone, recovering that data might be impossible or incredibly expensive.

Sys admins must periodically test their ability to restore from backups to ensure backups are working correctly and the restore process does not have some unseen fatal flaw. At the same time, there should always be multiple generations of backups, with some of those backup sets off site.
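One way to make a periodic restore test measurable is to record checksums of the source data when the backup is taken and compare them against a trial restore. The sketch below assumes the backup tool has already restored files into a scratch directory; the function names `build_manifest` and `verify_restore` are hypothetical, and this is not a substitute for application-level checks such as starting a restored database.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large files do not exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def verify_restore(source_manifest: dict, restored_root: Path) -> list:
    """Compare a trial restore against the manifest recorded at backup time.

    Returns a list of problems; an empty list means every recorded file
    was restored with identical content.
    """
    restored = build_manifest(restored_root)
    problems = []
    for relative_path, digest in source_manifest.items():
        if relative_path not in restored:
            problems.append(f"missing: {relative_path}")
        elif restored[relative_path] != digest:
            problems.append(f"changed: {relative_path}")
    return problems
```

In use, a team might call `build_manifest` on the live data at backup time, store the result alongside the backup set, and run `verify_restore` against each scheduled trial restore, including restores from off-site generations.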

Natural disasters

Natural disasters can take many forms, including fires, floods and earthquakes. While the type of disaster might vary by region, just about all of them can damage hardware and cause data loss. Many can render the worksite inaccessible for long periods of time.

Some organizations might not realize that their offices lie in flood plains or earthquake-prone areas until it is too late. Mitigation against such issues takes a degree of forward planning.

The ability to fail over to the cloud keeps core services working: Not every application will be available, but those that are essential to running the business will be. Building in infrastructure to make remote work a viable option is another way to prepare for a variety of natural disasters.

Being able to fail over to the cloud and work off site takes some forethought, planning and application, but pays massive dividends should a disaster occur. Repairing and replacing buildings and hardware can take more time than people estimate, and a business that is unable to function during recovery is at risk of serious financial losses.

Figure: Example of a color-coded risk assessment matrix. DR teams can use a risk assessment matrix to determine the likelihood and severity of different IT disaster scenarios.

Ransomware attacks

Ransomware is not only one of the most damaging disasters that can happen to a business, but perhaps the most likely as well. It only takes one person with sufficient privileges clicking on the wrong link to cause chaos.

Defending against ransomware is neither trivial nor cheap. Much modern ransomware is designed not to activate until after it has compromised several generations of backups.

There are many ways to reduce the risk of a ransomware attack, but no single preventive tool. Keeping application and OS patches up to date, scanning email for questionable attachments, restricting access to external media and providing good user education will help.

Network interruptions

Unfortunately, this IT disaster scenario happens often. For example, heavy machinery can sever cables, rendering the network inaccessible. Network interruptions are an increasingly urgent concern as more IT systems become SaaS-based, because network connectivity is essential to reach and use a SaaS system.

Fortunately, the fix for this has become easily available and inexpensive in recent years. A secondary line is one option for small businesses, and many network routers offer 4G or 5G connectivity as a backup. While not ideal, a backup link makes a network interruption less of a disaster and more of an inconvenience. Incorporating backup connectivity does have a cost, but it might be worth it when the alternative is an office full of staff who cannot work.
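When testing this scenario, it can help to have a simple way to decide that the primary link is really down rather than briefly glitching. The following is a minimal sketch, assuming a monitoring host that probes a few external endpoints over TCP and declares an outage only after several consecutive failures. The class name `OutageDetector`, the threshold of three and the probe design are illustrative assumptions; routers with built-in failover typically handle this themselves.

```python
import socket


def tcp_probe(endpoint, timeout=2.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection(endpoint, timeout=timeout):
            return True
    except OSError:
        return False


def any_reachable(endpoints, probe=tcp_probe) -> bool:
    """Treat the link as up if at least one endpoint answers.

    Probing several independent endpoints avoids blaming the local
    link when a single remote service is down.
    """
    return any(probe(endpoint) for endpoint in endpoints)


class OutageDetector:
    """Declare an outage only after `threshold` consecutive failed checks."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.failures = 0

    def record(self, reachable: bool) -> bool:
        """Record one check result; return True while an outage is declared."""
        self.failures = 0 if reachable else self.failures + 1
        return self.failures >= self.threshold
```

A scheduled job could call `any_reachable` with endpoints the business depends on, feed the result to `OutageDetector.record`, and alert staff or start the documented failover procedure when it returns True.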

Hardware failure

Hardware failure can take many forms, including a single disk loss taking down a whole system that does not use RAID, faulty network switches and power supply failures.

Most hardware-based IT disaster scenarios can be mitigated with relative ease, but at the cost of added complexity and a higher price tag. One example is a database server. Such a server can be turned into a database cluster with highly available storage and networking. Doing this would easily cost double the price of a single nonredundant server. Administrators would also have to undergo training to manage such an environment.

Hardware failure can affect the cloud as well. However, it is usually abstracted away from the customer, and there are several copies of the data from which to rebuild and continue.

Stuart Burns is a virtualization expert at a Fortune 500 company. He specializes in VMware and system integration with additional expertise in disaster recovery and systems management. Burns received vExpert status in 2015.
