A storage snapshot is a set of reference markers for data at a particular point in time. A snapshot acts like a detailed table of contents, providing the user with accessible copies of data that they can roll back to.
How storage snapshots work
Storage snapshots are often based around the use of a differencing disk. A differencing disk is a special type of virtual hard disk that's linked to a parent virtual hard disk.
When an administrator creates a storage snapshot, the underlying system creates a differencing disk that's bound to the original virtual hard disk. All future write operations are directed to the differencing disk, leaving the original virtual hard disk in an unaltered state. The file system is completely unaware of the existence of a differencing disk. File systems continue to function just as they would on a physical machine.
Snapshots have parent-child relationships and form a tree. Each snapshot taken creates another branch of the tree.
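The differencing-disk behavior described above can be illustrated with a minimal Python sketch. It is a toy model, not the implementation of any particular hypervisor: the `VirtualDisk` class, the `take_snapshot` helper and the block-number-to-bytes mapping are all hypothetical names chosen for illustration. It shows writes landing in the child disk, reads falling through to the parent, and a second child forming a new branch of the tree.

```python
from __future__ import annotations

from typing import Dict, Optional


class VirtualDisk:
    """Toy block device: block number -> bytes, optionally layered on a parent."""

    def __init__(self, parent: Optional[VirtualDisk] = None) -> None:
        self.parent = parent
        self.blocks: Dict[int, bytes] = {}
        self.read_only = False

    def write(self, block: int, data: bytes) -> None:
        if self.read_only:
            raise PermissionError("disk is frozen by a snapshot")
        self.blocks[block] = data

    def read(self, block: int) -> bytes:
        # Walk up the parent chain until some layer holds the block.
        disk: Optional[VirtualDisk] = self
        while disk is not None:
            if block in disk.blocks:
                return disk.blocks[block]
            disk = disk.parent
        return b""  # assumption: never-written blocks read as empty


def take_snapshot(active: VirtualDisk) -> VirtualDisk:
    """Freeze the active disk and return a differencing disk bound to it."""
    active.read_only = True
    return VirtualDisk(parent=active)


base = VirtualDisk()
base.write(0, b"v1")

active = take_snapshot(base)        # base is now the point-in-time image
active.write(0, b"v2")              # the write lands in the differencing disk

branch = VirtualDisk(parent=base)   # a second child of base: a new branch
branch.write(1, b"test")

print(active.read(0))  # b'v2'
print(base.read(0))    # b'v1'
print(branch.read(0))  # b'v1'
```

Because `read` walks the chain one layer at a time, a longer chain of differencing disks means more lookups per read in this model.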
Snapshots are generally created for data protection, but they can also be used for testing application software and data mining. A storage snapshot can be used for disaster recovery (DR) when information is lost due to human error. Snapshots are also useful for reverting a system to a previous state if a bad patch has been installed.
Types of snapshot technology
Not all snapshots are based on differencing disks. There are several other types of storage snapshots:
Copy-on-write snapshots store metadata about the location of the original data without copying it when the snapshot is created. These snapshots are created almost instantly, with little performance effect on the system taking the snapshot. This enables the rapid recovery of a system in the event of a program malfunction.
The data in a copy-on-write snapshot is consistent with the exact time the snapshot was taken. The name refers to what happens afterward: before a block is overwritten, its original contents are copied to the snapshot storage area. However, all previous snapshots must be available if complete archiving or recovery of all the data on a network or storage medium is required. Every copy-on-write operation requires one read and two writes, because the original data must be read and written to a different location before it's overwritten.
Clone or split-mirror snapshots reference all the data on a set of mirrored drives. Each time the utility is run, a snapshot is created of the entire volume, not only new or updated data. This makes it possible to access data offline and simplifies the process of recovering, duplicating or archiving all the data on a drive. This is a slower process, and each storage snapshot requires as much storage space as the original data.
Copy-on-write with background copy takes snapshot data from a copy-on-write operation and uses a background process to copy the data to the snapshot storage location. This process creates a mirror of the original data and is considered a hybrid between copy-on-write and cloning.
Redirect-on-write storage snapshots are similar to copy-on-write, but write operations are redirected to storage that's provisioned for snapshots, eliminating the need for two writes. Redirect-on-write snapshots write only changed data instead of a copy of the original data. When a snapshot is deleted, that data must be copied and made consistent on the original volume. Each additional storage snapshot further complicates access to both the original data and the snapshot data.
Incremental snapshots create timestamps that allow a user to go back to any point in time. Incremental snapshots can be generated faster and more frequently than other types of storage snapshots. And because they don't use much more storage space than the original data, they can be kept longer. Each time an incremental snapshot is generated, the original snapshot is updated.
VMware snapshots copy a virtual machine disk file and can restore a virtual machine (VM) to a specific point in time if a failure occurs. VMware snapshot technology is used in VMware virtual environments, and the snapshots themselves are often deleted within an hour. Administrators can take multiple snapshots of a VM, creating multiple point-in-time restore points. When a snapshot is taken, any writeable data becomes read-only.
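To make the I/O difference between the copy-on-write and redirect-on-write types described above concrete, here is a minimal Python sketch that counts reads and writes for each approach. The class names, counters and in-memory dictionaries are illustrative assumptions, not any vendor's implementation, and each volume is assumed to have a single snapshot taken at construction time.

```python
from typing import Dict


class CopyOnWriteVolume:
    """The volume is updated in place; old blocks are preserved in a snapshot area."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self.volume = dict(blocks)
        self.snapshot_area: Dict[int, bytes] = {}
        self.reads = 0
        self.writes = 0

    def write(self, block: int, data: bytes) -> None:
        if block not in self.snapshot_area:
            old = self.volume.get(block, b"")   # read the original block
            self.reads += 1
            self.snapshot_area[block] = old     # write 1: preserve the original
            self.writes += 1
        self.volume[block] = data               # write 2: overwrite in place
        self.writes += 1

    def read_snapshot(self, block: int) -> bytes:
        # Preserved copy if the block has changed, otherwise the live block.
        if block in self.snapshot_area:
            return self.snapshot_area[block]
        return self.volume.get(block, b"")


class RedirectOnWriteVolume:
    """The original volume stays frozen; new writes go to snapshot storage."""

    def __init__(self, blocks: Dict[int, bytes]) -> None:
        self.volume = dict(blocks)
        self.redirected: Dict[int, bytes] = {}
        self.reads = 0
        self.writes = 0

    def write(self, block: int, data: bytes) -> None:
        self.redirected[block] = data           # a single redirected write
        self.writes += 1

    def read_current(self, block: int) -> bytes:
        if block in self.redirected:
            return self.redirected[block]
        return self.volume.get(block, b"")

    def read_snapshot(self, block: int) -> bytes:
        return self.volume.get(block, b"")

    def delete_snapshot(self) -> None:
        # Deleting the snapshot folds the redirected data into the original volume.
        self.volume.update(self.redirected)
        self.redirected.clear()


initial = {0: b"a", 1: b"b"}
cow = CopyOnWriteVolume(initial)
row = RedirectOnWriteVolume(initial)
cow.write(0, b"A")
row.write(0, b"A")

print(cow.reads, cow.writes)  # 1 2
print(row.reads, row.writes)  # 0 1
```

In this model the redirect-on-write volume defers its extra work to `delete_snapshot`, which mirrors the merge cost noted above.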
Continuous data protection
Continuous data protection (CDP) uses changed block tracking and snapshots to back up a system in a way that allows users to recover the most up-to-date instance of data.
CDP works by monitoring a storage device at the block level. Any time a storage block is created or modified, that storage block is automatically backed up. This allows a user to recover data with the most recent changes included. With periodic storage snapshots alone, any updates made after the last snapshot can be lost if the system fails.
CDP also keeps a record of every change that occurs, so it's always possible to recover the most recent clean copy of the data.
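The change record described above can be pictured as an append-only journal of block writes. The following Python sketch is a simplified illustration under that assumption: `CdpVolume`, `JournalEntry` and the use of sequence numbers as recovery points are hypothetical, and a real CDP product would typically journal to separate storage and use timestamps.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class JournalEntry:
    seq: int      # position of this change in the journal
    block: int    # which block was written
    data: bytes   # the contents written


class CdpVolume:
    """Toy volume that journals every block write as it happens."""

    def __init__(self) -> None:
        self.blocks: Dict[int, bytes] = {}
        self.journal: List[JournalEntry] = []

    def write(self, block: int, data: bytes) -> int:
        self.blocks[block] = data
        entry = JournalEntry(seq=len(self.journal) + 1, block=block, data=data)
        self.journal.append(entry)   # every change is backed up immediately
        return entry.seq

    def recover(self, up_to_seq: int) -> Dict[int, bytes]:
        """Rebuild the volume as it looked after change number up_to_seq."""
        image: Dict[int, bytes] = {}
        for entry in self.journal:
            if entry.seq > up_to_seq:
                break
            image[entry.block] = entry.data
        return image


vol = CdpVolume()
vol.write(0, b"ledger v1")
clean = vol.write(1, b"index v1")   # last known-good change
vol.write(0, b"corrupted")          # a bad write happens afterward

print(vol.recover(clean))  # {0: b'ledger v1', 1: b'index v1'}
```

Replaying the journal up to a chosen entry is what lets this model return either the most recent state or the last clean one.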
Storage snapshots and backup
Although snapshots offer backup-like capabilities, snapshots and backups are quite different from one another. Snapshots aren't intended to be a replacement for backups, although many modern backup systems incorporate snapshots.
Snapshots vs. backups
There are several benefits to using storage snapshots as part of a larger backup strategy. Snapshots provide quick and easy point-in-time recovery and can be used by backup applications to enable features such as instant recovery. Although storage snapshot technology is a helpful supplement to a backup plan, it isn't considered a full replacement for a traditional backup.
There are several reasons why snapshots shouldn't be used as an alternative to backups. First, snapshots can negatively affect a system's performance. This is especially true of differencing disk snapshots. Each time a snapshot is created, an additional differencing disk is created. The system's read performance diminishes with the creation of each additional differencing disk.
Another reason why snapshots aren't a suitable backup replacement is that snapshots are dependent on source data. If the source data is lost, the snapshot is gone as well. Unlike a backup, a snapshot doesn't contain a copy of the protected data and does nothing to protect the source data against loss due to hardware failure or storage corruption.
How storage snapshots and backups work together
Modern backup systems used in a production environment often use snapshots as a part of the backup process. This is especially true when backing up an active database. If an active database were simply copied to backup, then the data in the database would likely change before the backup is complete. The resulting backup would be corrupt.
Modern backup systems take a snapshot of a database prior to initiating a backup. The backup application then copies the database as it existed at the moment the snapshot was created. When the backup process completes, the snapshot is deleted and the changes that accumulated in the snapshot during the backup are merged into the database.
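The consistency problem described above can be reproduced in a few lines of Python. This is a deliberately simplified sketch: the two-record "database", the `transfer` function and the use of `dict(db)` as a stand-in for an instant snapshot are all assumptions made for illustration. A real snapshot would record metadata rather than copy the data.

```python
from typing import Dict


def transfer(db: Dict[str, int], amount: int) -> None:
    """A transaction that must be seen entirely or not at all."""
    db["a"] -= amount
    db["b"] += amount


def backup_live(db: Dict[str, int]) -> Dict[str, int]:
    """Copy the active database record by record while it keeps changing."""
    backup: Dict[str, int] = {}
    backup["a"] = db["a"]
    transfer(db, 30)              # a write lands in the middle of the backup
    backup["b"] = db["b"]
    return backup


def backup_via_snapshot(db: Dict[str, int]) -> Dict[str, int]:
    """Freeze a point-in-time view first, then copy from that view."""
    snapshot = dict(db)           # stand-in for an instant snapshot
    backup: Dict[str, int] = {}
    backup["a"] = snapshot["a"]
    transfer(db, 30)              # the same write no longer affects the backup
    backup["b"] = snapshot["b"]
    return backup                 # the snapshot is discarded once the copy is done


live = backup_live({"a": 50, "b": 50})
snap = backup_via_snapshot({"a": 50, "b": 50})

print(live, sum(live.values()))  # {'a': 50, 'b': 80} 130
print(snap, sum(snap.values()))  # {'a': 50, 'b': 50} 100
```

The live copy mixes data from before and after the transaction, so its total no longer adds up, while the snapshot-based copy reflects a single moment in time.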