Disaster can strike any data center on any day. Money, time and effort enable physical IT assets to be fully restored -- often to states exceeding predisaster levels. The same, however, is not always true for crucial data, which may be lost forever unless it has been properly retrieved from backups and carefully reconstructed.
Here are five things you should know about data recovery and post-disaster data reconstruction, before doomsday arrives.
Data recovery planning and testing are the most important things any organization can do to prepare against data loss. "These days, disasters can include being hacked or having your data ransomed," said Mike Orosz, senior director of threat services and technology transformation at Citrix. "If your data is already replicated and available in more than one place, that 'disaster' isn't going to be the end of the world, because you've already planned for it."
"Businesses should identify what their tolerance is for potential data loss and operational downtime by performing a business impact analysis," said Tom Reynolds, director of technology solutions at Razor Technology, an IT managed services provider. Such an analysis will allow the organization to determine two key recovery metrics: the recovery point objective (RPO) and the recovery time objective (RTO). The RPO represents how much data the business can stand to lose, while the RTO indicates how long the business can be without functional systems during a recovery. "Once these metrics are determined, proper technologies can be put in place to ensure that the desired levels of recoverability can be achieved."
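As a rough illustration of how the two metrics can be checked once they are set, the sketch below compares a system's last backup time and estimated restore duration against its RPO and RTO. The function name, the timestamps and the one-hour and eight-hour targets are hypothetical values chosen for the example, not figures from any of the experts quoted here.

```python
from datetime import datetime, timedelta


def meets_objectives(last_backup, incident_time, estimated_restore, rpo, rto):
    """Return (rpo_ok, rto_ok) for one system.

    The data loss window is the time between the last good backup and the
    incident; it must not exceed the RPO. The estimated restore duration
    must not exceed the RTO.
    """
    data_loss_window = incident_time - last_backup
    return data_loss_window <= rpo, estimated_restore <= rto


# Hypothetical example: nightly-style backup taken four hours before the incident.
incident = datetime(2024, 1, 10, 12, 0)
last_backup = datetime(2024, 1, 10, 8, 0)

rpo_ok, rto_ok = meets_objectives(
    last_backup,
    incident,
    estimated_restore=timedelta(hours=6),
    rpo=timedelta(hours=1),   # assumed tolerance for lost data
    rto=timedelta(hours=8),   # assumed tolerance for downtime
)
print(rpo_ok, rto_ok)  # False True
```

In this example the restore would finish within the RTO, but four hours of data would be lost against a one-hour RPO, which suggests more frequent backups or replication would be needed.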
The data recovery process should start the moment a disaster is declared and confirmed. Tomas Honzak, director of security and compliance at data analytics specialist GoodData, noted that smart organizations have a documented data recovery plan that outlines the steps that need to be taken during an emergency to protect vital assets, including data. The plan also establishes the decision criteria for assessing whether a situation is a disaster and designates the officer(s) who are authorized to make key decisions. "Otherwise, incident managers must reach out to the company's executive team and obtain the authorization to move forward," he said.
Once the root cause of the data loss has been identified and chain of custody concerns are ruled out, it's necessary to identify the incident's scope and the exact data loss period, said Joe Kurfehs, senior consultant at SystemExperts, an IT security and compliance consulting firm. Inventory and take possession, if necessary, of the backup media. "Collect any available transaction logs to be used for the reconstruction and verification purposes," Kurfehs said.
Check the data for integrity
Data integrity checks, along with restoration validity, should be performed on a routine basis for all backup data and media. "If there is truly a crisis-level event, the backup servers should be verified via an incident response team to ensure they were not touched, manipulated or harmed by an adversary in any way," said Sean Mason, incident response director for Cisco Security Advisory Services.
"Test to make sure the replication of the data is complete," Orosz said. "Gone is gone, so make sure you have the data backed up before you need it."
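One common way to run a routine integrity check is to compare each backup file against a checksum recorded when the backup was taken. The sketch below assumes a simple manifest that maps file paths to SHA-256 digests; the manifest format, the function names and the demo file are illustrative assumptions rather than part of any particular backup product.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large backup files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backups(manifest):
    """Return the paths that are missing or whose digest differs from the manifest.

    `manifest` is a hypothetical dict of {file path: expected SHA-256 hex digest}.
    """
    failures = []
    for path, expected in manifest.items():
        p = Path(path)
        if not p.is_file() or sha256_of(p) != expected:
            failures.append(str(path))
    return failures


# Demo with a throwaway file: it passes first, then fails after being altered.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "orders.bak"
    backup.write_bytes(b"order data")
    manifest = {str(backup): sha256_of(backup)}

    good_result = verify_backups(manifest)
    backup.write_bytes(b"tampered")
    bad_result = verify_backups(manifest)

print(good_result, len(bad_result))  # [] 1
```

A checksum match only shows that the file is unchanged since the digest was recorded. It does not prove the backup can be restored, so periodic test restores are still needed.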
Expect to lose some data
During any restore process, it's possible that some data won't be recovered. Such losses can be minimized by adopting near-real-time replication as part of your data recovery plan. "However, no method will ensure complete restoration of data in all situations," Kurfehs said. "Gaps in data must be recreated using transaction logs."
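To make the idea of recreating gaps from transaction logs concrete, here is a minimal sketch that replays log entries recorded after the last backup onto the restored data. The log format (sequence number, operation, key, value) and the function name are assumptions made for illustration; real database logs are far more involved. The sketch also reports sequence numbers missing from the log, since those records cannot be rebuilt automatically.

```python
def replay_log(restored, log, last_backup_seq):
    """Apply log entries newer than the backup to a copy of the restored data.

    Returns (state, missing), where `missing` lists sequence numbers that
    should exist after the backup but are absent from the log.
    """
    state = dict(restored)
    pending = sorted(
        (e for e in log if e["seq"] > last_backup_seq),
        key=lambda e: e["seq"],
    )
    for entry in pending:
        if entry["op"] == "put":
            state[entry["key"]] = entry["value"]
        elif entry["op"] == "delete":
            state.pop(entry["key"], None)
        else:
            raise ValueError(f"unknown operation: {entry['op']!r}")

    seen = {e["seq"] for e in pending}
    highest = max(seen, default=last_backup_seq)
    missing = [s for s in range(last_backup_seq + 1, highest + 1) if s not in seen]
    return state, missing


# Hypothetical data: the backup captured everything up to sequence number 5.
restored = {"inv-1": 100, "inv-2": 250}
log = [
    {"seq": 5, "op": "put", "key": "inv-2", "value": 250},
    {"seq": 6, "op": "put", "key": "inv-3", "value": 75},
    {"seq": 8, "op": "delete", "key": "inv-1"},
]

state, missing = replay_log(restored, log, last_backup_seq=5)
print(state)    # {'inv-2': 250, 'inv-3': 75}
print(missing)  # [7]
```

Here entry 7 never reached the log, so whatever it changed would have to be recovered from another source or flagged as a known gap.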
Recovery teams should also reach out to affected business managers for assistance, Honzak said. "In some cases, there will be alternative sources of data -- such as invoices sent out via email -- that can be used to recreate the missing data manually or even semi-automatically via [optical character recognition] or the parsing of derived documents and files," he explained. "If there is no way the data can be validated automatically, business [managers] must be made aware of all the missing or potentially incorrect records."
Anticipate a much heavier than normal workload
When post-disaster data recovery involves hundreds, or perhaps even thousands, of machines, it's unlikely that existing staff, using possibly limited resources, will be able to restore data in time to ensure business continuity.
"Proactive [testing] should take place to better understand the true impact and recovery time," Mason said. "It is also imperative to [determine] how much work the recovery [project] needs by bringing in outside assistance to help manually restore the data."
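A back-of-the-envelope calculation can show why a large restore may outrun existing staff. The sketch below assumes every machine takes the same time to restore and that a fixed number of restores can run in parallel; both are simplifications, and the figures (1,000 machines, two hours each, a 24-hour target) are hypothetical.

```python
import math


def estimate_restore_hours(machines, hours_per_machine, parallel_restores):
    """Total elapsed hours if restores run in equal-sized waves."""
    waves = math.ceil(machines / parallel_restores)
    return waves * hours_per_machine


def parallel_restores_needed(machines, hours_per_machine, rto_hours):
    """Smallest number of simultaneous restores that fits within the RTO.

    Returns None if even a single restore takes longer than the RTO.
    """
    waves_allowed = int(rto_hours // hours_per_machine)
    if waves_allowed < 1:
        return None
    return math.ceil(machines / waves_allowed)


# Hypothetical scenario: 1,000 machines, 2 hours each, 10 restores at a time.
print(estimate_restore_hours(1000, 2, 10))     # 200
print(parallel_restores_needed(1000, 2, 24))   # 84
```

With these assumed numbers, a team that can run 10 restores at once would need about 200 hours, while meeting a 24-hour target would take 84 simultaneous restores. A gap of that size is the kind of finding that would justify lining up outside help in advance.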