Data retention policies are a serious matter and it is critical to consider the long-term consequences of implementation. When admins develop a policy for data retention, they must consider the reason why the organization archives data in the first place.
There are two questions that will have a major effect on the way an organization constructs its data retention policy:
- Does the IT department need to free up space on some of the servers?
- Are the servers' contents becoming so cluttered that it's becoming increasingly difficult to locate data?
Retention policies do not just indicate what data an organization must keep; they also let admins know what data is okay to delete. In an age where people are generating massive amounts of data, it can be easy for critical data to get lost underneath a pile of redundant or unnecessary copies.
Once an organization identifies that it must free up space to better locate critical data, there are several backup retention best practices that will help create an optimal policy.
What is a backup retention policy?
A backup retention policy is an internal organizational rule that determines what data the organization keeps, where it keeps the data and how long it keeps the data. Retention policies may indicate the types of backup that are acceptable. For example, a policy may stipulate that the data must be on multiple backup media, such as tape and cloud. Different methods, such as full backups, incremental backups and differential backups, may also be part of a retention policy.
Retention policies exist for numerous reasons, and often ensure that customer or client data is secure and accessible. Industries such as healthcare, education, IT and retail will all have different requirements in their respective retention policies. Some retention policies also may include rules for data that must be deleted after a certain period.
Retention policy considerations
There are two major factors admins must consider when they craft a retention policy: what data to retain and compliance. Businesses create huge amounts of data, so policies must determine and document what data the organization cannot delete. Adherence to compliance laws and regulations is also critical to avoid fines or other penalties.
What data to retain. Some data is required by law to be retained for a certain time frame, while other data is useful to keep but is not subject to any legal retention requirement. Organizations can also implement internal rules around what data they retain and for how long, provided those rules meet or exceed compliance regulations. Common types of retained data include files, email messages and database records.
One of the first backup retention best practices to keep in mind is knowing what data should remain live and what data the organization should archive. Typically, this determination will be made based on data age, but that is not always the case. In some cases, admins examine criteria such as when the data was last accessed and the data type.
For example, an organization may have plenty of free space on the file server but want to cut down on some of the clutter. With this data deletion goal in mind, it decides to create an archive policy that moves anything older than five years to the archives and then deletes anything more than 10 years old.
Although this might sound like a reasonable approach to creating a data retention policy, it may have unwanted consequences. What happens if a spreadsheet was created six years ago but is regularly updated? If the data retention policy only looks at the creation date, then the spreadsheet would be archived, even though it is regularly used. It tends to be much more effective to base a retention policy on the last access date rather than the creation date.
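The difference between the two approaches can be sketched in a few lines of Python. This is a minimal illustration, not a production tool: the function name `classify` is hypothetical, the five-year and 10-year thresholds are taken from the example above, and a year is approximated as 365 days.

```python
from datetime import datetime, timedelta

# Thresholds from the example policy (a year is approximated as 365 days).
ARCHIVE_AFTER = timedelta(days=5 * 365)
DELETE_AFTER = timedelta(days=10 * 365)


def classify(last_accessed: datetime, now: datetime) -> str:
    """Return 'keep', 'archive' or 'delete' based on idle time.

    The decision uses the last access date, not the creation date,
    so an old file that is still in use stays live.
    """
    idle = now - last_accessed
    if idle > DELETE_AFTER:
        return "delete"
    if idle > ARCHIVE_AFTER:
        return "archive"
    return "keep"


now = datetime(2024, 1, 1)

# Spreadsheet created six years ago but opened last month.
print(classify(datetime(2023, 12, 1), now))

# Report untouched for seven years.
print(classify(datetime(2017, 1, 1), now))
```

Under a creation-date rule, the six-year-old spreadsheet would have been archived. Here it stays live because only its idle time is measured.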
Data retention policies can also backfire in other ways. For example, if an organization signed a 15-year lease for its office building 11 years ago, in all likelihood nobody has looked at the document in the last 10 years, so a rule based purely on age or last access would flag it for deletion. However, the organization almost certainly needs to keep it. The policy should account for cases like this.
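One way to handle documents like the lease is an explicit hold date that overrides any age-based rule. The sketch below assumes a hypothetical `retain_until` field on each record and a 10-year idle limit. Neither comes from a specific product, and the dates are made up for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Record:
    name: str
    last_accessed: date
    # Hypothetical field: an explicit hold that overrides age-based rules.
    retain_until: Optional[date] = None


def may_delete(record: Record, today: date, max_idle_years: int = 10) -> bool:
    """Allow deletion only if no hold is active and the record is idle too long."""
    if record.retain_until is not None and today <= record.retain_until:
        return False
    idle_days = (today - record.last_accessed).days
    return idle_days > max_idle_years * 365


today = date(2024, 6, 1)

# A 15-year lease signed 11 years ago: idle for years, but held until it expires.
lease = Record("office-lease.pdf", date(2013, 6, 1), retain_until=date(2028, 6, 1))
memo = Record("old-memo.txt", date(2013, 6, 1))

print(may_delete(lease, today))  # the hold wins
print(may_delete(memo, today))   # no hold, idle longer than the limit
```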
A data retention policy should be comprehensive, but also easy to manage and enforce, so being concise and clear is important as well.
Compliance. One of the major reasons for a company to retain data is compliance. In addition to a company's internal compliance rules, there are several laws and regulations that a company needs to consider in forming its data retention policy. It's important to figure out the applicable laws; an outside auditor can help.
The European Union's GDPR, for example, which went into effect in May 2018, features mandates applied to personal data produced by EU residents, no matter where it's stored. A data-collecting organization should have a data retention policy that specifically outlines GDPR compliance issues.
Other regulations that feature data retention requirements include the Sarbanes-Oxley Act and the Payment Card Industry Data Security Standard. Especially as it relates to these regulations, an organization should only keep personal data that's needed.
Regulatory compliance is a common business concern. Penalties for violations include fines and loss of reputation. A data retention schedule within an internal policy can be a helpful tool for compliance. Organizations must also ensure policies are updated to reflect changes in data production and compliance or data security laws.
Backup retention policy and scheduling checklist
Nuances in retention policies will vary by organization, but the checklist below covers the basic steps needed for a solid backup retention plan. Admins can follow these steps when they create a data retention policy:
- Define the data.
- Organize the data by lifecycle.
- Determine the number of versions to store.
- Outline backup type and frequency.
- Create a lifecycle policy for each dataset.
- Delete and purge unnecessary files.
- Review and run the backup retention policy.
The final step is crucial to a successful policy, so don't skip out on the review process. This step will let admins know if there is anything that they must update, ideally before it affects client data.
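As a rough illustration of the checklist, a per-dataset lifecycle policy can be captured as a small data structure with a review step that flags obvious mistakes. The class `DatasetPolicy`, its field names and the sample values below are assumptions made for this sketch, not a standard schema.

```python
from dataclasses import dataclass
from typing import List

VALID_BACKUP_TYPES = {"full", "incremental", "differential"}


@dataclass(frozen=True)
class DatasetPolicy:
    dataset: str             # which data the policy covers
    backup_type: str         # "full", "incremental" or "differential"
    frequency_hours: int     # how often a backup runs
    versions_to_keep: int    # number of versions to store
    archive_after_days: int  # when data moves to the archive
    delete_after_days: int   # when data is purged

    def problems(self) -> List[str]:
        """Review step: return a list of issues; an empty list means none found."""
        issues = []
        if self.backup_type not in VALID_BACKUP_TYPES:
            issues.append(f"unknown backup type: {self.backup_type}")
        if self.frequency_hours <= 0:
            issues.append("frequency must be positive")
        if self.versions_to_keep < 1:
            issues.append("must keep at least one version")
        if self.delete_after_days <= self.archive_after_days:
            issues.append("deletion must come after archiving")
        return issues


policy = DatasetPolicy("finance-db", "incremental", 24, 30, 365, 2555)
print(policy.problems())
```

Running a check like this before a policy goes live is one way to catch mistakes, such as a deletion window shorter than the archive window, before they affect client data.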
Best practices for backup retention policies
Below are some backup retention best practices that admins can reference when they create a new policy for their organization. Some regulations and security concerns may not affect certain organizations, but general best practices for backup admins include the following:
- Consider how industry regulations will affect the retention strategy.
- Always consider backup datasets, type and frequency.
- Identify and address restoration scenarios.
- Keep incremental backups within a reasonable size.
- Keep the last backup in an easily accessible spot.
- Evaluate cycle-based vs. time-based backup retention.
- Ensure there is enough storage for data backups.
- Schedule backups when the organization has the most available bandwidth.
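The cycle-based vs. time-based choice in the list above is easier to evaluate with a concrete case. The sketch below assumes that cycle-based retention keeps the newest N backups regardless of age, and that time-based retention keeps every backup younger than a cutoff. The function names and dates are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List


def keep_by_cycle(backups: List[datetime], cycles: int) -> List[datetime]:
    """Cycle-based: keep the newest `cycles` backups, regardless of age."""
    return sorted(backups, reverse=True)[:max(cycles, 0)]


def keep_by_time(backups: List[datetime], now: datetime,
                 max_age: timedelta) -> List[datetime]:
    """Time-based: keep every backup no older than `max_age`."""
    return sorted((b for b in backups if now - b <= max_age), reverse=True)


now = datetime(2024, 1, 31)

# Weekly backups, but the two most recent runs never happened.
backups = [datetime(2024, 1, 3), datetime(2024, 1, 10), datetime(2024, 1, 17)]

print(keep_by_cycle(backups, cycles=2))                   # newest two survive
print(keep_by_time(backups, now, timedelta(days=7)))      # nothing is recent enough
```

In this scenario, the time-based rule would purge every remaining backup once the jobs stop running, while the cycle-based rule always leaves the last two. That trade-off is one reason to weigh both approaches.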
Where to store your backups
A retention period is often determined by rules and regulations. Since retention periods range from minutes to years, an organization may need different types of media for storing data.
The public cloud is a popular storage location for long-term retention. Amazon S3 Glacier, Microsoft Azure Blob Storage and Google Cloud Storage Nearline are among the options for low-cost archival storage in the cloud. The storage is off-site, which is good for data protection. Restore times and costs can run high, though, depending on how much data an organization needs to bring out of the cloud.
Tape is another media type for long-term storage and is cheaper than other options, such as disk. Durability is typically stated at up to 30 years for the latest LTO tape cartridges. LTO-9 provides 18 TB of native capacity, or up to 45 TB compressed. Restore speed is slow, however, so an organization shouldn't rely solely on tape to retain data that needs quick recovery.
Disk is more expensive but faster than tape. It's not a cost-effective place to store lots of data that needs long-term retention and probably won't be accessed by the organization frequently.
A solid backup retention policy may use the reliable 3-2-1 backup method. This commonly used rule states that organizations should create three copies of the data, store the copies on two different types of storage media and send one copy off-site. This method may be a bit simple given today's variety of backup options, but it is a good rule of thumb for admins who are beginning to outline a policy.
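The 3-2-1 rule is simple enough to check mechanically. The sketch below is one possible reading of it: the `BackupCopy` class and `meets_3_2_1` function are hypothetical names, and the check counts every copy passed in, so whether the primary copy is included is left to the caller.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BackupCopy:
    media: str     # e.g. "disk", "tape" or "cloud"
    offsite: bool  # True if the copy is stored away from the primary site


def meets_3_2_1(copies: List[BackupCopy]) -> bool:
    """Check for three copies, two media types and one off-site copy."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


plan = [
    BackupCopy("disk", offsite=False),
    BackupCopy("tape", offsite=False),
    BackupCopy("cloud", offsite=True),
]
print(meets_3_2_1(plan))
```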