
Image-based vs. file-based backup: Key comparisons
Image-based backups protect entire systems with single files, while file-based backups offer granular protection. Most organizations benefit from implementing both approaches.
Backups pose a complex dilemma for any organization. Countless backup methodologies, hardware and software have proven their scalability to meet varied operational requirements. But the trick is selecting an approach that addresses recoveries as quickly and efficiently as possible. Consequently, backup administrators face a choice: image-based backups vs. file-based backups.
Image-based backups create a single backup file that includes the complete contents of an entire system – its operating systems (OSs), applications, data and virtual elements such as virtual machines and containers. An image-based backup is fundamentally a snapshot, or point-in-time (PIT) copy, of the system's state at that moment. Image-based backups provide excellent protection against disasters and relatively fast bare-metal restores of entire systems.
File-based backups offer more granular protection and efficient backups of selected files and targeted folders. As such, they are well-suited for data protection and feature quick recovery of damaged, accidentally deleted or maliciously impacted data. In addition, file-based backups can use incremental or differential methods, which reduce the storage capacity and time required to create each iterative backup. Shorter backup times, in turn, make a smaller recovery point objective (RPO) practical.
Both image- and file-based backup approaches bring value to an enterprise. Image-based backups readily support complex IT environments and protect the business from major events such as disasters, while file-based backups safeguard specific data and enable the business to recover quickly from minor data loss incidents. Often, organizations pair the two approaches in their risk mitigation or data protection strategies.
Importance of choosing the right backup configuration
Although deployable simultaneously, image-based and file-based backups address different concerns that demand specific considerations. Some common criteria when choosing between image- and file-based backups include:
- What is the backup goal? Understanding the business purpose of the backup process helps fit the backup into the organization's strategic risk management and mitigation scheme. Image-based backups protect entire systems and are better for large and all-encompassing issues, such as disaster protection. File-based backups are better for smaller or simpler everyday problems, such as accidental file deletion.
- What content needs protection? If only data requires protection, file-based backups typically work adequately. If the content involves entire software stacks, including OSs, drivers, applications, configuration files and data sets, an image-based backup is probably preferable.
- How much backup storage is required? Every backup requires storage, and large backups often consume many terabytes (TB) of valuable storage capacity. Image-based backups typically demand more storage and quickly multiply storage demands over time. File-based backups usually require far less storage – though large data sets sometimes become extremely large – and further reduce backup time and storage demands with incremental and differential backup techniques.
- How much data loss is tolerable? Data changes continuously, but a backup captures only a single point in time and only when invoked. Any changes made after the most recent backup are not protected, resulting in potential data loss. Consider the RPO. Large and complex image-based backups can take hours to complete, while file-based backups often take mere minutes. Organizations that require low RPOs must carefully weigh the use of image-based backups.
- How quickly must data be recovered? As with backups, recoveries are time-consuming. Large, image-based backups take significant time to restore, resulting in a sizable recovery time objective (RTO). By comparison, restoring relatively small files and folders affords far smaller RTOs. Organizations that require small RTOs must therefore carefully weigh the implications of image-based backups.
- What security features are needed? Backups contain sensitive business information and must be protected with the organization's prevailing security practices. Neither image-based nor file-based backups are inherently more secure. Still, ensure any backup tool – software or platform – provides encryption and strong authentication for every backup process.
- What are prevailing regulatory obligations? Ensure specific backup needs meet regulatory compliance requirements, business continuity goals and other industry-specific obligations, such as data protection in healthcare. Specific requirements influence the choice of backup methodology. For example, image-based backups performed at less-frequent intervals are usually a better choice to safeguard an organization against malware such as ransomware, particularly when such safeguards are an obligation or standard practice.
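The storage criterion above can be made concrete with a rough estimate. The sketch below is illustrative only: the function names, sizes, change rate and retention count are all hypothetical, and it assumes every retained image is stored as a full copy with no compression or deduplication, which real backup tools may well apply.

```python
def full_image_storage_tb(image_tb: float, copies_retained: int) -> float:
    """Storage needed if every retained backup is a complete system image."""
    return image_tb * copies_retained


def incremental_storage_tb(full_tb: float, change_rate: float, copies_retained: int) -> float:
    """Storage for one full file backup plus incrementals holding only changed data.

    change_rate is the assumed fraction of the data set that changes between backups.
    """
    incrementals = copies_retained - 1
    return full_tb + full_tb * change_rate * incrementals


# Hypothetical inputs: a 2 TB system image, 0.5 TB of protected files,
# 2% of file data changing between backups, and 30 retained copies.
image_total = full_image_storage_tb(2.0, 30)
file_total = incremental_storage_tb(0.5, 0.02, 30)

print(f"Image-based: {image_total:.2f} TB")  # Image-based: 60.00 TB
print(f"File-based:  {file_total:.2f} TB")   # File-based:  0.79 TB
```

The gap between the two totals comes almost entirely from retention: each additional retained image adds its full size, while each additional incremental adds only the changed data.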
Pros and cons of image-based backup
Image-based backups provide users with many benefits, including the following:
- Simplicity. Because an entire system is backed up as one entity, the backup offers a "single backup file" restoration for complete system recovery and full operational status. Nothing else is required.
- Faster system recovery. The single backup file eliminates virtually all guesswork and laborious preparation needed for recovery. This doesn't guarantee shorter recovery time overall—remember, image backup files are sometimes quite large compared to individual data files—but it's far faster than cobbling together and reconfiguring a system from scratch, reinstalling applications, and then attempting data recovery from a backup.
- Consistency. The single backup file's proven consistency ensures that all elements involved—applications, configurations and dependencies—are backed up together in a known-good working state.
- Bare-metal restorations. Since image-based backups typically feature all system components – OS, drivers, applications, configuration settings and all associated data – restoration is available to any suitable hardware platform, all without the need to reinstall or reconfigure the OS or any supporting element.
Despite these benefits, backup administrators must recognize drawbacks to image-based backups. Among them are:
- Larger backups. Single-image backups are quite large – sometimes as large as several terabytes, depending on the size and content of the system. This necessitates significant storage capacity, and the demand for storage only balloons as backups multiply across multiple systems.
- Slower backups. Large backup images lengthen backup time, which puts tremendous pressure on RPOs because backup creation itself takes time. For example, if an image-based backup requires two hours to complete, the minimum practical RPO (and the potential window of data loss) is two hours. An image-based methodology is often unsuitable for regular use when shorter RPOs are required.
- Slower restoration. Restoring a huge backup image file requires significantly more time than restoring file-based backups, though it's still less than rebuilding from scratch. These longer recovery times sometimes exceed the desired RTOs. Again, shorter RTOs often clash with the time required to restore an image-based backup.
- Difficulty restoring individual files. While it's possible to parse an image-based backup to find and restore individual files or folders, any search through the backup image is typically time-consuming and problematic. In many cases, it's best simply to restore the entire image, bringing its other drawbacks to the fore.
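The two-hour example in the list above follows from simple arithmetic: a backup cannot repeat faster than it completes, so its duration sets a floor on the achievable RPO. The following is a minimal sketch of that calculation. The function names and the size and throughput figures are hypothetical, and it assumes a constant sustained throughput, which real networks and storage rarely deliver.

```python
def backup_hours(size_gb: float, throughput_mb_per_s: float) -> float:
    """Hours needed to copy size_gb at a sustained throughput (1 GB = 1000 MB)."""
    seconds = (size_gb * 1000) / throughput_mb_per_s
    return seconds / 3600


def meets_rpo(size_gb: float, throughput_mb_per_s: float, rpo_hours: float) -> bool:
    """True if the backup finishes within the target RPO window."""
    return backup_hours(size_gb, throughput_mb_per_s) <= rpo_hours


# Hypothetical system: a 1,440 GB image copied at 200 MB/s.
print(backup_hours(1440, 200))             # 2.0
print(meets_rpo(1440, 200, rpo_hours=1))   # False: a one-hour RPO is out of reach
print(meets_rpo(1440, 200, rpo_hours=4))   # True
```

The same calculation applies in reverse to restores: dividing the image size by the restore throughput gives a lower bound on the RTO.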
Pros and cons of file-based backup
A file-based backup requires tradeoffs of its own, but common benefits include:
- Fast file recovery. Since file-based backups capture individual files and folders, it's relatively simple to locate and restore specific data elements on demand. No attempt is needed to restore an entire system. This makes file-based backups ideal for quick data recovery following an everyday incident, such as accidental deletion or malware, and provides extremely small RTOs.
- Smaller backups. Although files and folders are sometimes quite large, file-based backups present smaller data protection objects. They're far faster to back up and use far less storage than image-based backups.
- Faster backups. Most individual files and folders have less content, which copies to backups much faster than full system images. In addition, file-based backups readily employ incremental or differential backup techniques, which capture only the data changed since the previous backup (incremental) or since the last full backup (differential). This further accelerates the file-based backup process and supports extremely small RPOs.
- Cross-platform support. Since these backups only contain files and folders, they support data restores across many different systems because the files and folders are unaffected by system specifics.
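The incremental and differential techniques mentioned in the list above differ only in their reference point. The sketch below illustrates that difference using modification times. It is a simplified model, not how any particular backup product works: the `FileEntry` type, the file names and the hour-based timestamps are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileEntry:
    path: str
    modified_at: int  # hypothetical timestamp: hours since the full backup


def differential(files: list[FileEntry], last_full_at: int) -> list[str]:
    """Select everything changed since the last FULL backup."""
    return [f.path for f in files if f.modified_at > last_full_at]


def incremental(files: list[FileEntry], last_backup_at: int) -> list[str]:
    """Select only what changed since the most recent backup of any kind."""
    return [f.path for f in files if f.modified_at > last_backup_at]


files = [
    FileEntry("report.docx", modified_at=5),
    FileEntry("ledger.xlsx", modified_at=30),
    FileEntry("notes.txt", modified_at=50),
]
last_full_at = 0     # full backup taken at hour 0
last_backup_at = 24  # most recent backup of any kind taken at hour 24

print(differential(files, last_full_at))   # ['report.docx', 'ledger.xlsx', 'notes.txt']
print(incremental(files, last_backup_at))  # ['ledger.xlsx', 'notes.txt']
```

In this model, a differential restore needs the full backup plus the latest differential, while an incremental restore needs the full backup plus every incremental since. That is the tradeoff between smaller backups and simpler restores.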
Even with the benefits of file-based backups, it's important to remember the common disadvantages of the methodology, including:
- Difficult system recovery. In the event of a major incident, such as a disaster, it's often impossible to recover a complete system from file backups only. File-based backups do not support bare-metal restoration. Rebuilding a system from scratch, reconfiguring OSs and applications, and then attempting to restore data from backups is time-consuming and complex, and it often fails or yields incomplete results.
- Complexity. File-based backups generate a significant number of individual backup files, including full files and incremental or differential additions to full files. Identifying the correct file and then executing proper restoration of that intended file to the right location in production is a confusing process. Sometimes, it results in restoration errors. Backup management is critical when using file-based backups.
Comparing image-based and file-based backups
The right question is not, "Which is better, image-based or file-based backup?" Instead, ask, "Which is most appropriate for my intended data protection or business goal?" Both backup methodologies bring value to an enterprise, but it's worth delaying the choice until comparing a range of parameters, including:
- What is the backup methodology intended to do?
- What are some common use cases?
- How much backup storage does the methodology require?
- How quickly can a backup be created?
- How quickly can a backup be recovered?
- How often can a backup be created?
- How easily can desired data be extracted from the backup?
These issues are summarized in the table below.
| | Image-based backup | File-based backup |
| --- | --- | --- |
| Purpose | Creates a complete PIT image of a system or disk, including OSs, apps and data | Copies the selected files and folders |
| Use cases | Full system protection against major incidents, such as disasters | Data protection against common issues such as data deletion, corruption or malware |
| Backup storage needs | Large or complex systems demand extensive storage in the TB range | Storage varies with file and folder size and often becomes substantial |
| Backup performance | Backups require substantial time because they include all software elements of the system | Backups are quick for small files, but become substantial for large files and folders, such as data sets |
| Recovery performance | Restoring large images takes substantial time and often has a longer RTO, but it's faster than rebuilding a system from scratch | Recovery is quick for small files, with a shorter RTO in general, but the process demands more time for larger files and folders |
| Backup frequency | Occurs less often because of the size of the image and longer RPO | Occurs more often because of the relatively small data sets involved and shorter RPO |
| Ease of file recovery | Extremely difficult, if not impossible | Easy and intended |
Which backup technology is right for the organization?
Ultimately, the choice between image-based and file-based backups depends on the organization's specific data protection and recovery needs. But there are some common criteria to consider, including:
- Image-based backups are often the better option when protecting complex systems against serious disasters. An image-based backup creates a single file that is readily restorable to any suitable hardware platform, such as bare metal, with proven success.
- File-based backups are usually the better choice when protecting data at the granular file or folder level and when specific data requires quick recovery. File-based backups are better able to protect data against everyday issues, from accidental deletion to corruption due to hardware faults.
However, image-based and file-based backups are not mutually exclusive and readily combine to provide comprehensive protection against a broader range of potential incidents. Modern backup tools, especially those employing automated backup capabilities, support both methodologies simultaneously.
Stephen J. Bigelow, senior technology editor at TechTarget, has more than 30 years of technical writing experience in the PC and technology industry.