

Deconstructing disk capacity using VM thin provisioning

Although thin provisioning promises to reduce the disk footprint of VMs, it can lead to an increase in disk capacity usage, making capacity monitoring absolutely essential.

Thin provisioning has been a common part of storage systems for many years and part of vSphere for about as long. It is often viewed as a way to reduce the disk footprint of VMs and, consequently, the amount of storage that must be bought. Although thin provisioned VMs will use less disk capacity in their first year, there's no guarantee they will need less disk capacity over time. In fact, the flexibility that thin provisioning provides may lead your VMs to use more disk capacity than if they were not thin provisioned.

The general principle of VM thin provisioning is that you only commit physical disk space for the VM when the VM actually writes data. A thin provisioned VM with 800 GB of disk space available to the guest OS may only use 160 GB of physical storage. The alternative is thick provisioning, where all of the disk capacity exposed to the VM guest is allocated up front. A thick-provisioned VM with 800 GB of disk space will use all 800 GB of physical storage, and it uses that same amount from the moment it is created. The only way for a thick-provisioned VM to use more physical disk is if you provision more capacity to the VM guest. A thin provisioned VM, by contrast, uses more physical disk capacity as it writes more data to disk, so thin provisioned VMs grow over time. The guest OS in the VM is unaware of thick or thin provisioning; it always sees the provisioned space.
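As a rough illustration of the 800 GB example above, the following sketch compares the physical capacity a thin and a thick disk would consume. The function name and the simplified model (no metadata overhead, no snapshots) are assumptions made for illustration; this is not a vSphere API.

```python
def physical_usage_gb(provisioned_gb, written_gb, thin):
    """Physical capacity one virtual disk consumes, in GB (simplified model)."""
    if written_gb > provisioned_gb:
        raise ValueError("a guest cannot write more than its provisioned size")
    # Thick: all provisioned space is allocated when the disk is created.
    # Thin: only the space the guest has actually written is allocated.
    return written_gb if thin else provisioned_gb


thin_gb = physical_usage_gb(800, 160, thin=True)
thick_gb = physical_usage_gb(800, 160, thin=False)
print(f"thin: {thin_gb} GB, thick: {thick_gb} GB")
```

Under these assumptions, the same guest consumes 160 GB when thin provisioned and 800 GB when thick-provisioned.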


One of the key things to know is that the disk space currently used inside a VM indicates neither the actual size of the virtual disk file nor the physical disk usage. Most guest OSes do not reclaim the disk space used by deleted files, so creating a file in the guest OS will make the disk file grow, but deleting the file will not make the disk file shrink. Creating the same file again, or a new file, will often not reuse the freed blocks, so the disk file will grow again. The result is that most thin provisioned disk files are larger than the data they hold for the guest OS. This is particularly true for disk files that contain database logs, where the log files are deleted after a backup. There may only be 20 GB of log files at any one time, but over the lifetime of the VM there may have been hundreds of gigabytes of log files created and then deleted. This is also true for file servers, where files are frequently created and deleted, or simply moved from one directory on a file share to another. Even when the OS frees the disk blocks of deleted files, vSphere has no practical method to reclaim the freed space from the physical disk.
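The create-and-delete behavior described above can be sketched with a toy model. The class and method names are hypothetical, and the model assumes the worst case the text describes: new data always lands on never-written blocks while any remain, and no freed space is returned to physical storage.

```python
class ThinDiskModel:
    """Toy model of one thin provisioned virtual disk (sizes in GB)."""

    def __init__(self, provisioned_gb):
        self.provisioned_gb = provisioned_gb
        self.guest_used_gb = 0  # data the guest OS currently holds
        self.file_size_gb = 0   # physical size of the virtual disk file

    def create_file(self, size_gb):
        if self.guest_used_gb + size_gb > self.provisioned_gb:
            raise ValueError("guest file system is full")
        self.guest_used_gb += size_gb
        # Assumption: new data goes to never-written blocks while any remain,
        # so the disk file grows until it reaches the provisioned size.
        never_written_gb = self.provisioned_gb - self.file_size_gb
        self.file_size_gb += min(size_gb, never_written_gb)

    def delete_file(self, size_gb):
        if size_gb > self.guest_used_gb:
            raise ValueError("cannot delete more data than the guest holds")
        # The guest frees the space, but the disk file does not shrink.
        self.guest_used_gb -= size_gb


disk = ThinDiskModel(provisioned_gb=100)
for _ in range(3):
    disk.create_file(10)
    disk.delete_file(10)

print(f"guest sees {disk.guest_used_gb} GB used, "
      f"disk file occupies {disk.file_size_gb} GB")
```

In this model the guest ends up holding no data at all, yet the disk file has grown to 30 GB, which is the gap between guest usage and physical usage that the paragraph describes.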

One of the benefits of thin provisioned disks is the ability to allocate a lot of capacity to a VM and not have to worry about the space filling up as much as if we allocated a smaller disk. This tendency to allocate larger disks is what can get us into trouble. Think about a database server that creates 20 GB of log files every week. If we allocated a 100 GB disk to it, then there would be plenty of space for logs, even if a backup were to fail occasionally and we ended up with 40 GB of log files. If the disk were thin provisioned, it would take about five weeks to expand to its provisioned size, and then it would stop growing. New log files would be created on top of deleted files because the guest OS has already written to every disk block. But we might choose to create the log disk larger, because it is thin provisioned. If we made it 500 GB provisioned, then it would take 25 weeks -- since 25 times 20 GB equals 500 GB -- to reach its provisioned size. There would still only be 20 GB of data on the disk, but the disk file would consume 500 GB of physical disk space. Since we provisioned a larger disk, it can and will eventually grow larger. Most thin provisioned disks will grow, and many will end up occupying their entire provisioned size, so be careful not to overextend disk capacity using VM thin provisioning.
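The log disk arithmetic above can be checked with a short sketch. The function names are hypothetical, and the calculation assumes, as the example does, a steady 20 GB of new logs each week that always land on never-written blocks until none remain.

```python
import math


def weeks_to_reach_provisioned_size(provisioned_gb, weekly_log_gb):
    """Weeks until the thin disk file has grown to its provisioned size."""
    return math.ceil(provisioned_gb / weekly_log_gb)


def disk_file_size_gb(provisioned_gb, weekly_log_gb, weeks):
    """Disk file size after a number of weeks; it never exceeds provisioned size."""
    return min(provisioned_gb, weekly_log_gb * weeks)


for provisioned_gb in (100, 500):
    weeks = weeks_to_reach_provisioned_size(provisioned_gb, weekly_log_gb=20)
    after_a_year = disk_file_size_gb(provisioned_gb, weekly_log_gb=20, weeks=52)
    print(f"{provisioned_gb} GB disk: full after {weeks} weeks, "
          f"{after_a_year} GB on physical disk after a year")
```

Both disks hold the same 20 GB of live logs, but after a year the larger disk occupies five times as much physical capacity in this model.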

Because thin provisioned VMs grow, it is important to view thin provisioning as a way to defer storage purchasing, not to avoid it. In the first year, thin provisioned VMs will use less physical disk capacity than if they were thick-provisioned. However, over the years the difference will shrink, and you will need more physical capacity in the second and subsequent years. Running out of physical disk capacity for thin provisioned VMs is bad -- really bad -- because a VM that needs to write to a new block has nowhere to put it. Make sure to monitor usage and buy more physical capacity before you need it. Plan and budget for more disk capacity purchasing and make sure your array can handle more physical disks.
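One way to frame the monitoring advice above is to track both how full a data store is today and how far it is overcommitted. The following is a minimal sketch with made-up numbers; the `DiskUsage` class, the `datastore_report` function and the 80% warning threshold are all assumptions for illustration, and the figures would have to come from your own monitoring tools.

```python
from dataclasses import dataclass


@dataclass
class DiskUsage:
    name: str
    provisioned_gb: float  # size the guest OS sees
    used_gb: float         # physical space the disk file occupies today


def datastore_report(capacity_gb, disks, warn_at=0.8):
    """Summarize current usage and worst-case growth for one data store."""
    provisioned_gb = sum(d.provisioned_gb for d in disks)
    used_gb = sum(d.used_gb for d in disks)
    return {
        "used_fraction": used_gb / capacity_gb,
        # Above 1.0, the disks cannot all grow to their provisioned size.
        "overcommit_ratio": provisioned_gb / capacity_gb,
        # Extra capacity needed if every disk grows to its provisioned size.
        "worst_case_shortfall_gb": max(0.0, provisioned_gb - capacity_gb),
        "needs_attention": used_gb / capacity_gb >= warn_at,
    }


disks = [
    DiskUsage("db-logs", provisioned_gb=500, used_gb=120),
    DiskUsage("file-server", provisioned_gb=800, used_gb=160),
    DiskUsage("app", provisioned_gb=200, used_gb=90),
]
report = datastore_report(capacity_gb=1000, disks=disks)
for key, value in report.items():
    print(f"{key}: {value}")
```

In this example the data store looks comfortable today at 37% used, but it is overcommitted by a factor of 1.5, so 500 GB of extra capacity would be needed if every disk grew to its provisioned size.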

I've talked about VMs being thin provisioned at the vSphere level on the data store. It is also possible for the storage array to be thinly provisioned under the data store. Array thin provisioning has the same net effect, but it applies to every disk of every VM on the thin provisioned data stores. The major difference is that vSphere is likely unaware of the array being thin provisioned, just as the guest OS is unaware of thin provisioning at the vSphere level. The one storage technology that can help you here is deduplication, but that is its own complicated technology.
