VirtualBox VM recovery: Two ways to salvage your data
If you have to recover a VirtualBox VM, you're in for a grand old time. Keep these tricks in mind to help get your data back.
Recently, I had the pleasure of working with a VirtualBox VM that decided to crash and remain unbootable. When something like this happens, there are a few disaster recovery methods for VirtualBox at your disposal.
Oracle VM VirtualBox is a hosted platform that can be used for server or desktop virtualization. I was using VirtualBox to host virtual desktops, and unfortunately, the data was hosted on the system's virtual machines (VMs) rather than externally. If you're using VirtualBox for desktop virtualization, you need to know about these VM recovery methods that could save your data.
VM recovery approaches
With my mission-critical data on the VM's virtual hard disk (VHD), the prospect of redoing a lot of work just to re-create the data wasn't appealing. The best thing to do was to recover the VHD, pull what I needed out of there and cut my losses.
There are two basic VirtualBox VM recovery options in this situation:
- Boot the VM using some kind of recovery media, access the VHD and copy whatever data you need to another VHD or across the network to a share.
- Attach the VHD to another VM as a secondary drive, boot this other VM, and recover the data from there.
The first approach is slightly safer because you minimize the number of changes being made to both the VHD and its VM. The second also works, but I'm a bigger fan of the first because there's less chance for you to mess things up.
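For readers who script VirtualBox from the host, both options come down to a single VBoxManage storageattach call. The sketch below only builds the argument lists; the VM names, controller names, paths and port numbers are placeholders, and the right port and device values depend on how the VM's storage controller is laid out.

```python
from typing import List


def attach_recovery_iso(vm: str, controller: str, iso_path: str,
                        port: int = 1, device: int = 0) -> List[str]:
    """Option 1: put recovery media in the crashed VM's optical drive."""
    return [
        "VBoxManage", "storageattach", vm,
        "--storagectl", controller,
        "--port", str(port), "--device", str(device),
        "--type", "dvddrive",
        "--medium", iso_path,
    ]


def attach_as_secondary_disk(helper_vm: str, controller: str, disk_path: str,
                             port: int = 1, device: int = 0) -> List[str]:
    """Option 2: hang the damaged disk off a second, healthy VM."""
    return [
        "VBoxManage", "storageattach", helper_vm,
        "--storagectl", controller,
        "--port", str(port), "--device", str(device),
        "--type", "hdd",
        "--medium", disk_path,
    ]
```

Either list can be handed to subprocess.run(cmd, check=True) once both VMs are powered off.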
As you can imagine, the devil is in the details. As I began my own VirtualBox VM recovery, I realized there were more details than I initially thought. Below are some challenges you might encounter with these disaster recovery methods.
Where's the VHD?
First, you need to know what kind of disk image you're dealing with and which VM it's attached to. Oracle VM VirtualBox can create several different kinds of VHD images, and the differences between them affect your VM recovery.
Most of the time, VHDs are classified as "normal," which means only one VM at a time can use the disk in question, and the source of the damage is typically that single VM. A drive marked as "shareable" that multiple machines use at once may have been damaged by any of the VMs attached to it, which makes troubleshooting harder. ("Write-through" disks are like normal disks, but with no snapshotting.)
In both of these cases, make sure to completely power off all VMs that have been using the corrupted drive -- don't just suspend! Then, perform the recovery from one VirtualBox VM, preferably the one with the most memory.
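One way to confirm that a VM is really powered off, not just suspended, is to read the VMState line from VBoxManage showvminfo with the --machinereadable flag. The sketch below parses that text; treating only "poweroff" and "aborted" as safe is an assumption made here, not a rule taken from VirtualBox.

```python
from typing import List

# Assumption: states in which the VM no longer holds the disk open.
RELEASED_STATES = {"poweroff", "aborted"}


def vm_state(machine_readable: str) -> str:
    """Pull the VMState value out of `showvminfo --machinereadable` text."""
    for line in machine_readable.splitlines():
        if line.startswith("VMState="):
            return line.split("=", 1)[1].strip().strip('"')
    return "unknown"


def disk_is_released(state: str) -> bool:
    """True only when the VM is fully off; 'saved' and 'paused' do not count."""
    return state in RELEASED_STATES


def poweroff_command(vm: str) -> List[str]:
    """Hard power-off for a running or paused VM.

    A VM in the 'saved' state is not running, so it has to be resumed and
    shut down, or have its saved state discarded, instead.
    """
    return ["VBoxManage", "controlvm", vm, "poweroff"]
```

Running the check against every VM that has the disk attached gives a quick list of machines that still need attention.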
More than one VM at a time can use a "multi-attach" disk, but each VM has its own separate differencing image. This format is useful if you have one disk image that you want to share as a master image among multiple VMs. But it means VirtualBox disaster recovery has to be performed from the VM where the data in question was being used, or the VirtualBox VM won't see it. If you don't know which VM that is, you might have to do some spelunking on each machine.
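The disk type and the VMs using a disk can usually be read from the output of VBoxManage showmediuminfo. The sketch below parses that "Label: value" text; the field labels "Type" and "In use by VMs", and the sample text itself, are assumptions based on typical output and may differ between VirtualBox versions.

```python
from typing import Dict

# Illustrative text only, not captured from a real run.
SAMPLE = """\
State:          created
Type:           multiattach
Location:       /vms/master.vdi
In use by VMs:  desktop-01 (UUID: 00000000-0000-0000-0000-000000000001)
"""


def parse_medium_info(text: str) -> Dict[str, str]:
    """Turn 'Label: value' lines into a dict, splitting on the first colon."""
    info: Dict[str, str] = {}
    for line in text.splitlines():
        if not line or line[0].isspace():
            continue  # skip blank and indented continuation lines
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def disk_type(info: Dict[str, str]) -> str:
    """First word of the Type field, e.g. 'normal (base)' -> 'normal'."""
    words = info.get("Type", "").split()
    return words[0].lower() if words else "unknown"


info = parse_medium_info(SAMPLE)
print(disk_type(info), "|", info.get("In use by VMs", "not attached"))
```

A result of "multiattach" or "shareable" is the cue to check every VM named in the usage field before starting the recovery.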
A note about the recovery environment: If possible, choose one that mounts the VHD in question as read-only by default, so nothing gets written to the damaged disk while you work. (Windows does not do this well, unfortunately.)
Once you've figured out which VM and VHD to recover your data from, the next step is to prepare a place for the data.
Network vs. local-disk VM recovery
One of the obvious ways to save the data is to create a new, blank VHD, attach it to the VM, boot to some kind of recovery media and copy everything out. This VirtualBox VM recovery method is simple: All you have to do is mount and format the new media. But it also creates an extra step: If you need the data to be on the host rather than another VHD, you have to copy it out again afterward.
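Creating and attaching the blank target disk can also be scripted. The sketch below assumes the createmedium subcommand (older VirtualBox releases call it createhd), the VDI format and a free port 2 on the controller; all names and sizes are placeholders.

```python
from typing import List


def new_target_disk_commands(vm: str, controller: str, disk_path: str,
                             size_mb: int, port: int = 2) -> List[List[str]]:
    """Return two commands: create a blank disk, then attach it to the VM."""
    create = [
        "VBoxManage", "createmedium", "disk",
        "--filename", disk_path,
        "--size", str(size_mb),   # size is given in megabytes
        "--format", "VDI",
    ]
    attach = [
        "VBoxManage", "storageattach", vm,
        "--storagectl", controller,
        "--port", str(port), "--device", "0",
        "--type", "hdd",
        "--medium", disk_path,
    ]
    return [create, attach]
```

The new disk still has to be partitioned and formatted from inside the recovery environment before anything can be copied onto it.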
You can save yourself a step by instead connecting to a network share -- one shared out from your host -- and copying the data out across that share. There are a few possible hitches with this approach, though.
First, if the VirtualBox VM you're using is not configured to talk to the network or is only on a local network that doesn't speak to the host itself, you'll have to reconfigure it. This only requires changing the VM hardware settings, not the OS, since you're not actually going to boot that OS during the recovery.
Second, you need to make sure the recovery system has network access. You might need to change the network adapter type, for instance, if the adapter in the VM isn't recognized without additional drivers.
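Both network changes are VM hardware settings, so they can be made with VBoxManage modifyvm while the VM is powered off. The sketch below assumes a host-only interface named vboxnet0 (the usual name on Linux and macOS hosts) and the Intel 82540EM adapter type, which most recovery media recognize; option spellings have varied between VirtualBox releases, so check them against your version's help output.

```python
from typing import List


def network_commands(vm: str, host_only_if: str = "vboxnet0",
                     nic_type: str = "82540EM") -> List[str]:
    """Put adapter 1 on a host-only network and pick a common NIC model."""
    return [
        "VBoxManage", "modifyvm", vm,
        "--nic1", "hostonly",
        "--hostonlyadapter1", host_only_if,   # assumed interface name
        "--nictype1", nic_type,               # assumed adapter model
    ]
```

A host-only network is enough when the share lives on the host; a bridged adapter would be the alternative if the share is on another machine.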
Last but not least, the copying speed across the network, depending on the target, may be problematic. If you're using a network that just links back to the host, the amount of data being copied shouldn't matter. But if you're connecting to something that's on an actual network with latency and bandwidth constraints, copying data will be much slower. (Copying a lot of data to a locally attached VHD can be just as slow, especially if the disk image you're copying to is on the same spindle as the one you're copying from. That's something else to watch out for.)
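A rough estimate shows how much the link speed matters. The sketch below ignores latency, protocol overhead and disk contention, so it gives a best-case figure only; the 50 GB and 100 Mbit/s values are made up for illustration.

```python
def estimate_seconds(size_bytes: int, link_mbit_per_s: int) -> float:
    """Best-case copy time: total bits divided by link speed in bits/second."""
    return size_bytes * 8 / (link_mbit_per_s * 1_000_000)


# 50 GB over a 100 Mbit/s link: 4,000 seconds, a little over an hour.
seconds = estimate_seconds(50_000_000_000, 100)
print(f"{seconds:.0f} s, about {seconds / 60:.0f} minutes")
```

Real transfers will take longer, which is a good reason to copy only the data you actually need.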
Bypassing the VirtualBox VM entirely
Yet another approach is to skip involving VirtualBox entirely, mount the virtual hard disk directly in the host OS, and copy out the files that way. The exact method varies by host OS, however. There are native ways to do this in Linux using libguestfs, for instance.
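On a Linux host with the libguestfs tools installed, the guestmount command can expose a disk image's file systems at a mount point. The sketch below builds a read-only mount command and the matching unmount; the image path and mount point are placeholders, and whether a given image opens cleanly depends on its format and condition.

```python
from typing import List


def guestmount_readonly(image: str, mountpoint: str) -> List[str]:
    """Mount the guest's file systems read-only at an existing directory.

    -a adds the disk image, -i inspects it to find the guest's file
    systems, and --ro prevents any writes to the image.
    """
    return ["guestmount", "-a", image, "-i", "--ro", mountpoint]


def guestunmount(mountpoint: str) -> List[str]:
    """Release the mount when the copy is finished."""
    return ["guestunmount", mountpoint]
```

Once the mount succeeds, the files can be copied out with ordinary tools, and the image itself is never modified.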
For Windows, there is a program called WinMount that opens .VDI format files and makes them available through Explorer as just another drive letter. Files in the .VHD format can also be unpacked using the 7-Zip archiving tool, which is free and open source. I've used both tools with good results.
This VirtualBox VM recovery method removes the middle man of the VM entirely and allows you to manipulate the file on your own. But it also has a few gotchas:
- It helps to do a clean shutdown of the VM. If your VM crashed and left the disk in an inconsistent state, you may want to boot another instance of the same OS with that VM (perhaps via a .ISO image), perform a disk check and a clean shutdown, then attempt an offline recovery.
- These tools are all third-party items, so there's no guarantee of consistent functionality. They should work in most cases, but the advantage of using VirtualBox and its VMs to recover the data (especially for a .VDI file) is that there's a stronger guarantee the disk will be mounted and read successfully in the first place.
Performing a VirtualBox VM recovery is nobody's idea of fun, and it's not something you're necessarily equipped to do just from having run VirtualBox without incident. Before your VM goes down, get familiar with these disaster recovery methods.