https://www.techtarget.com/searchstorage/definition/storage-virtualization
Storage virtualization is the pooling of physical storage from multiple storage devices into what appears to be a single storage device or pool of available storage capacity. A central console is used to manage the storage.
Storage virtualization or virtualized storage aims to abstract physical storage systems and drives in order to present them as a single pool of storage capacity. The capacity of this single virtual device can be centrally managed, thus simplifying storage allocation, maintenance and overall management.
Storage virtualization disguises the actual complexity of a storage system, such as a storage area network (SAN), which helps a storage administrator perform the tasks of backup, data archiving and recovery more easily and in less time. The virtualization software used intercepts input/output (I/O) requests from physical or virtual machines (VMs) and sends those requests to the appropriate physical location of the storage devices that are part of the overall pool of storage in the virtualized environment. To a user, however, the various storage resources that make up the virtualized pool are unseen, so the virtual storage appears like a single physical drive, share or logical unit number (LUN) that can accept standard reads and writes.
The virtualization and centralization capabilities distinguish this approach from bare-metal storage systems, where physical storage devices must be addressed directly. This is also why virtualization offers significant operational efficiencies over bare-metal provisioning of storage. Additionally, by allowing IT teams to address a single device as opposed to many, storage virtualization can improve the performance of storage environments and reduce compatibility and security issues.
To build a virtualized storage environment, multiple physical storage devices are grouped so that they appear to servers as a single storage resource. A software virtualization layer separates the storage hardware from the virtual volumes and redirects the I/O traffic to the appropriate physical device. The pooled capacity is divided into logical units, each identified by a LUN, which are then presented to remote servers as virtual disks. The servers, however, see the LUNs as physical disks. This makes it possible for the operating systems (OSes) and applications to access and use the storage.
Storage virtualization technology relies on software to identify available storage capacity from physical devices, to create an abstraction layer between the physical and virtual storage devices and to then aggregate the available capacity as a pool of storage that can be used by servers in a traditional architecture or by VMs in a virtual environment. In addition to identifying and compiling the available storage capacity, the software makes the capacity available for various applications to use.
To provide access to the data stored on the physical storage devices, the virtualization software needs to either create a map using metadata or use an algorithm to locate the data dynamically, on the fly. The software intercepts read and write requests from applications. Using the map or the algorithm, it can find the data on, or save it to, the appropriate physical device. This process is similar to the method used by OSes when retrieving or saving application data.
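The metadata-map approach can be illustrated with a minimal Python sketch. It is not taken from any particular product: the `VirtualVolume` class, the device names and the "least-used device" placement rule are all assumptions made for illustration, and in-memory dictionaries stand in for real hardware.

```python
class VirtualVolume:
    """Hypothetical virtual volume that locates data through a metadata map."""

    def __init__(self, device_names):
        if not device_names:
            raise ValueError("at least one physical device is required")
        # Each dict stands in for a physical device: physical block -> data.
        self.devices = {name: {} for name in device_names}
        # Metadata map: virtual block -> (device name, physical block).
        self.block_map = {}

    def write(self, vblock, data):
        """Intercept a write and route it to a physical location."""
        if vblock not in self.block_map:
            # Assumed placement rule: pick the device with the fewest used blocks.
            name = min(self.devices, key=lambda n: len(self.devices[n]))
            # Blocks are never freed in this sketch, so the used-block count
            # is also the next free physical block number.
            self.block_map[vblock] = (name, len(self.devices[name]))
        name, pblock = self.block_map[vblock]
        self.devices[name][pblock] = data

    def read(self, vblock):
        """Intercept a read and fetch the data from wherever the map points."""
        name, pblock = self.block_map[vblock]
        return self.devices[name][pblock]


volume = VirtualVolume(["disk-a", "disk-b"])
for vblock, payload in enumerate([b"alpha", b"beta", b"gamma"]):
    volume.write(vblock, payload)
```

The caller only ever uses virtual block numbers; which device holds a given block is recorded in `block_map` and never exposed to the application.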
A redundant array of independent disks or RAID array can sometimes be considered a type of storage virtualization. Multiple physical drives in the array are presented to the user as a single storage device that, in the background, stripes and replicates data to multiple disks to improve I/O performance and protect data in case a single drive fails.
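Striping and replication can be sketched in a few lines of Python. The example assumes the simplest possible layouts: round-robin striping with one block per stripe unit, and full mirroring of every block. The function names are hypothetical, and real RAID levels add parity, larger stripe units and rebuild logic that are not shown here.

```python
def stripe_location(logical_block, num_disks):
    """Map a logical block to (disk index, block offset on that disk)
    using round-robin striping with a stripe unit of one block."""
    if num_disks < 1:
        raise ValueError("num_disks must be at least 1")
    if logical_block < 0:
        raise ValueError("logical_block must be non-negative")
    return logical_block % num_disks, logical_block // num_disks


def mirror_write(disks, block, data):
    """Replicate a block to every disk (each disk is a dict in this sketch)."""
    for disk in disks:
        disk[block] = data


def mirror_read(disks, block, failed=frozenset()):
    """Read a block from the first disk that has not failed."""
    for index, disk in enumerate(disks):
        if index not in failed and block in disk:
            return disk[block]
    raise IOError(f"block {block} is unavailable on all surviving disks")


mirrored = [{}, {}]
mirror_write(mirrored, 0, b"payload")
# The data is still readable when disk 0 is marked as failed.
survivor_copy = mirror_read(mirrored, 0, failed={0})
```

Striping spreads consecutive blocks across disks so they can be accessed in parallel, while mirroring keeps a second copy so a single drive failure does not lose data.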
Some of the benefits and uses of storage virtualization include the following:
When first introduced more than two decades ago, storage virtualization tended to be difficult to implement and had limited applicability. Because it was originally host-based, virtualization software also had to be installed and maintained on all servers needing access to the pooled storage resources.
Storage virtualization could also create compatibility and interoperability issues. For example, the virtualization environment might not be fully compatible with protocols like Network File System (NFS), or it might not integrate with the automation tools, OSes or hypervisors used by an organization. This could lead to operational disruptions. It could also necessitate additional purchases to facilitate integration, orchestration and interoperability between the virtualization environment and the existing IT infrastructure.
Another potential issue was related to performance. Some virtual environments had high latency and, therefore, could not meet the performance requirements of certain applications. Admins needed to consider many aspects, including storage controller capabilities and caching mechanisms, to minimize the impact of virtualization on performance.
Data security was another concern that hindered the adoption of storage virtualization. If the virtualized environment does not support data encryption or does not provide strong authentication and access controls, it puts the security and integrity of data, both at rest and in transit, at risk. To protect their data in a virtualized storage environment, organizations need to implement these measures, as well as effective data backup procedures.
Fortunately, many of these drawbacks have already been addressed or minimized. As virtualization technology has matured, organizations are able to implement it for many different use cases. Also, they can choose from multiple virtualization methods and select the method that makes the most operational and financial sense for their existing infrastructure and IT requirements.
Developments in virtualization software have also made it easier to deploy storage virtualization in different environments. In addition, the emergence of standards such as the Storage Management Initiative Specification enables virtualization products to work with a wider variety of storage systems. For these reasons, virtualization is an attractive option for enterprises looking to increase storage capacities and simplify storage management, while controlling storage costs.
There are two basic methods of virtualizing storage: file-based and block-based.
File-based storage virtualization. File-based storage virtualization is applied to NAS systems. Using Server Message Block (SMB) in Windows server environments or NFS protocols for Linux systems, file-based storage virtualization breaks the dependency in a normal NAS array between the data being accessed and the physical location where it is stored.
The pooling of NAS resources makes it easier to handle file migrations in the background, which can help improve performance. Typically, NAS systems are not that complex to manage, but storage virtualization further simplifies their management through a single management console.
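The decoupling of a file's path from its physical location can be sketched as a small lookup table. This is an illustration only: the `FileNamespace` class, the filer names and the paths are hypothetical, and a real product would also copy the data and handle the SMB or NFS protocol details.

```python
class FileNamespace:
    """Hypothetical global namespace: clients use a stable virtual path,
    while the table records which NAS system currently holds the file."""

    def __init__(self):
        # virtual path -> (filer name, path on that filer)
        self._location = {}

    def publish(self, virtual_path, filer, physical_path):
        """Register a file under a virtual path."""
        self._location[virtual_path] = (filer, physical_path)

    def resolve(self, virtual_path):
        """Return the current physical location for a virtual path."""
        return self._location[virtual_path]

    def migrate(self, virtual_path, new_filer, new_physical_path):
        """Repoint a virtual path after the file has been moved in the background.
        Copying the actual data is outside the scope of this sketch."""
        if virtual_path not in self._location:
            raise KeyError(virtual_path)
        self._location[virtual_path] = (new_filer, new_physical_path)


namespace = FileNamespace()
namespace.publish("/projects/report.docx", "nas-01", "/vol1/projects/report.docx")
before = namespace.resolve("/projects/report.docx")
namespace.migrate("/projects/report.docx", "nas-02", "/vol7/archive/report.docx")
after = namespace.resolve("/projects/report.docx")
```

The client-visible path stays the same before and after the migration; only the entry in the lookup table changes.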
Block-based storage virtualization. In block-based storage virtualization, the virtualization management software collects the capacity of the available blocks of storage space across all virtualized arrays. It pools them into a shared resource to be assigned to any number of VMs, bare-metal servers or containers.
The storage resources are typically accessed via a Fibre Channel (FC) or Internet Small Computer System Interface (iSCSI) SAN. Block-based systems abstract the logical storage, such as a drive partition, from the actual physical storage blocks in a device, such as a hard disk drive (HDD) or solid-state drive (SSD). Because it operates in a similar fashion to the native drive software, there's less overhead for read and write processes, so block storage systems generally perform better than file-based systems.
Notwithstanding the benefits of SANs, managing SANs can be a time-consuming process. Consolidating multiple block storage systems under a single management interface that often shields users from the tedious steps of LUN configuration, for example, can be a significant timesaver. Block-based virtualization is also known as block access storage.
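The pooling step in block-based virtualization can be sketched as follows. The `StoragePool` class, the array names and the first-fit allocation across arrays are assumptions chosen to keep the example short; they do not describe how any specific SAN virtualization product allocates capacity.

```python
class StoragePool:
    """Hypothetical pool that aggregates free capacity from several arrays
    and carves volumes out of it, hiding which array backs each volume."""

    def __init__(self, array_capacities_gb):
        # array name -> free capacity in GB
        self.free_gb = dict(array_capacities_gb)
        # volume name -> list of (array name, GB taken from that array)
        self.volumes = {}

    @property
    def total_free_gb(self):
        return sum(self.free_gb.values())

    def create_volume(self, name, size_gb):
        """Allocate a volume, spanning arrays if one array is not enough."""
        if size_gb <= 0:
            raise ValueError("size_gb must be positive")
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        if size_gb > self.total_free_gb:
            raise ValueError("not enough free capacity in the pool")
        extents = []
        remaining = size_gb
        for array, free in self.free_gb.items():
            if remaining == 0:
                break
            take = min(free, remaining)
            if take:
                extents.append((array, take))
                self.free_gb[array] -= take
                remaining -= take
        self.volumes[name] = extents
        return extents


pool = StoragePool({"array-1": 100, "array-2": 50})
vm_extents = pool.create_volume("vm-data", 120)
```

The consumer asks only for a volume of a given size; the pool decides which arrays supply the capacity, which is the kind of detail a single management interface is meant to hide.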
There are generally two types of virtualization that can be applied to a storage infrastructure:
Although waning as a backup target medium, tape storage is still widely used for archiving infrequently accessed data. Archival data tends to be voluminous; tape media can employ storage virtualization to make it easier to manage large data stores.
Linear Tape File System (LTFS) is a form of tape virtualization that makes a tape look like a typical NAS file storage device. It makes it much easier to find and restore data from tape using a file-level directory of the tape's contents.
There are multiple approaches to storage virtualization:
In addition to the above, storage virtualization can also be applied to a virtual environment at the OS level or the file-system level. With the former, the OS includes features that allow for the creation of tiered storage. The latter refers to using technologies that provide users with a consolidated view of file data even though those files might be scattered across many different file servers. Users might also be able to access the files remotely due to the file replication capability provided by the file-system virtualization technology.
In the late 1960s and early 1970s, IBM developed the concept of virtualization in the context of time-sharing for mainframe computers -- the idea that multiple users could share the usage of expensive mainframe devices without having to purchase or lease them. This approach helped to reduce the cost of providing computing capabilities, and allowed more users and organizations to use those capabilities in a cost-effective manner. Similar potential benefits drove the development of storage virtualization technology and solutions.
IBM SAN Volume Controller was an early version of a block-based virtualization appliance. Now called IBM Spectrum Virtualize, the appliance supports large-scale workloads and enables hybrid cloud storage deployments for 500-plus supported storage systems. The Spectrum Virtualize software provides insulation from physical storage and can be used in the appliance along with other server virtualization and containerization technologies.
Another early storage virtualization product was Hitachi Data Systems' TagmaStore Universal Storage Platform. That product evolved into Hitachi Vantara's Virtual Storage Platform One (VSP One), which offers virtualization and aggregation so organizations can create large-scale storage pools and then logically partition them to optimize application quality of service. The platform also reduces storage-management complexity and offers high configuration flexibility.
In the late 1990s, VMware released VMware Workstation, a virtualization product that included a hypervisor to help IT admins set up VMs on a single x86 machine running either Linux or Windows. The hypervisor enables organizations to simultaneously run multiple applications on a single piece of hardware, thus simplifying hardware management and reducing costs. Advanced hypervisors include features like fault tolerance and high availability to reduce the likelihood of downtime events and minimize the impact of these events on business continuity and productivity.
From the 2000s onwards, many more companies entered the virtualization space, including Microsoft, Red Hat and Citrix Systems. Today, many enterprise data centers use the virtualization techniques and solutions developed by these organizations to create large aggregated pools of storage and other resources and offer those resources to the organization as agile and scalable VMs.
Storage virtualization today usually refers to capacity that is accumulated from multiple physical devices and then made available to be reallocated in a virtualized environment. Modern IT methodologies, such as hyperconverged infrastructure and containerization, take advantage of virtual storage, in addition to virtual compute power and often virtual network capacity.
Edge computing also relies on storage virtualization. Virtualization allows organizations to meet their storage requirements and simplify storage management and maintenance in edge computing environments. Also, virtualized storage environments are more compact than physical environments and require less hardware and management resources. All of this can deliver big cost efficiencies and also benefit organizations with limited space and smaller IT teams.
Although storage virtualization is by no means extinct, it is largely overshadowed by cloud computing. In this newer computing paradigm, organizations determine the amount and type of storage they need. The cloud service provider (CSP) then configures and provisions this storage from its virtualized storage pools and makes it available to the organization on demand. With cloud-based virtualized storage, organizations can access the storage resources they need without having to worry about various storage management tasks. Furthermore, since the CSP provides the resources on a "pay as you go" basis, the business can control its costs and, in many cases, achieve faster time to value.
24 Feb 2025