Storage system design has always entailed a tradeoff among three parameters: capacity, throughput and IOPS. Unfortunately for systems engineers, the physical limitations of storage components don't allow the three to be set independently -- at least, not in hardware.
Parametric dependency forces designers to choose between capacity and performance. Need more throughput? Throw more spindles or SSD controllers at the problem and live with the unused capacity.
Workloads that require high I/O throughput for tiny data sets highlight the provisioned IOPS-versus-capacity conundrum. As one example, a comment on the Azure Feedback portal illustrates the problem of not being able to provision performance independently of capacity for Azure managed disks.
Software-defined storage (SDS) can decouple these parameters by placing a logical abstraction layer between storage resources and the physical components that provide them. Furthermore, a centralized software control plane enables cloud storage services to carve up capacity granularly by spreading logical block volumes and file shares across storage nodes and drives.
Even though hyperscale, distributed storage systems are amenable to independent optimization, as detailed below, only a few services let customers make this tradeoff, likely because of the high cost of providing the capability and sparse demand for it.
Origins of the problem and early solutions
The link between IOPS and capacity stems from the mechanical limitations of spinning magnetic platters and hard disk heads that left only four ways to increase IOPS:
- Faster rotational speed
- Higher-density magnetic media
- More read-write heads
- Larger RAM caches
Although SSDs eliminate the mechanical constraints on throughput and I/O, they have other limitations, including:
- The speed of storage cell read and write operations.
- Large memory blocks in NAND flash, which cause write amplification and access latency.
- Drive controller throughput, which is gated by the speed of the embedded microcontroller, memory buffers, NAND I/O channels and SATA interfaces.
These place an upper limit on SSD throughput and IOPS, particularly for random writes, which can have 10 times the latency of sequential writes.
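Write amplification is easy to see with a back-of-the-envelope model. The sketch below uses hypothetical round-number sizes, not figures from any particular drive, and assumes the worst case in which a small random write forces the controller to rewrite an entire erase block:

```python
# Illustrative worst-case model of NAND write amplification: a small
# random write triggers a read-modify-write of a full erase block.
# Sizes are hypothetical round numbers, not vendor specifications.

def write_amplification(write_bytes: int, erase_block_bytes: int) -> float:
    """Worst-case ratio of physical bytes written to logical bytes written."""
    return erase_block_bytes / write_bytes

KIB = 1024
# A 4 KiB random write landing in a 256 KiB erase block can force the
# controller to rewrite the whole block: 64x amplification.
print(write_amplification(4 * KIB, 256 * KIB))  # 64.0
```

Real controllers mitigate this with buffering and wear-leveling firmware, but the asymmetry is why random writes lag sequential writes so badly.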
The traditional method of feeding high-IOPS workloads is to spread storage volumes across as many devices as needed in a RAID and to add larger RAM caches as an I/O buffer. The first tactic results in unused capacity, while the second adds significant cost.
Use SDS, cloud services to provision IOPS independently
Perhaps the first product to disconnect storage capacity from throughput came from SolidFire, a company acquired by NetApp in 2015. SolidFire pioneered a quality of service (QoS) feature that enforced minimum, maximum and burst levels for throughput and IOPS. As noted in an early paper describing the architecture, SolidFire allocated performance and capacity independently for each volume in a system. The company offered few details of the internals, but in its scale-out system, each 1U node was part of a distributed controller connected via a dedicated high-speed back-end network. The software could transparently carve up capacity across as many nodes as needed to meet QoS guarantees.
Although cloud providers are famously secretive about the physical hardware and the provisioning and management software that underpin their services, they take SolidFire's approach of scale-out arrays with distributed controllers to rack- and pod-scale. Providers typically deploy hundreds of identical storage servers that are aggregated into resource pools for storage services of different categories -- such as block, file and object -- and performance levels. For example, Amazon Elastic Block Store (EBS) comes in multiple varieties, including general-purpose SSD, called gp2 and gp3, and provisioned IOPS SSD, called io1 and io2.
Typically, cloud storage products offer IOPS tiers with different capacity limits, both minimum and maximum. In contrast, gp3 volumes, which were unveiled at re:Invent 2020, are unusual in that they enable users to independently increase throughput and IOPS without having to provision more block storage capacity.
The gp2 and gp3 volumes provide soft caps on IOPS performance and, according to Amazon, deliver within 10% of the provisioned IOPS performance 99% of the time in a given year. Furthermore, gp2 volumes under 1,000 GB have a burst performance up to 3,000 IOPS for at least 30 minutes, while gp3 volumes provide a minimum of 3,000 IOPS with no burst capability. In contrast, provisioned IOPS io1 and io2 volumes, according to Amazon, deliver within 10% of the provisioned IOPS performance 99.9% of the time in a year, with sub-10 ms latency. When paired with AMD-based r5 Elastic Compute Cloud instances, io2 volumes can provide up to 260,000 IOPS for volumes as small as 4 GB.
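Those "within 10%" SLA statements translate into a simple performance floor. A minimal sketch, which just treats the 10% figure as a tolerance band around the provisioned level:

```python
def iops_floor(provisioned: int, tolerance: float = 0.10) -> float:
    """IOPS a volume should deliver within the SLA window, reading
    'within 10% of provisioned' as a 10% tolerance band."""
    return provisioned * (1 - tolerance)

# A 16,000-IOPS gp3 volume should deliver at least 14,400 IOPS
# 99% of the time under this reading of the SLA.
print(iops_floor(16_000))  # 14400.0
```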
Several AWS competitors offer similar flexibility to provision IOPS independently from capacity. These include:
- Azure Ultra disks, which come in several fixed sizes from 4 gibibytes (GiB) to 64 tebibytes. Users can set limits up to 300 IOPS per GiB, with a maximum of 160,000 IOPS per disk. Thus, a 32 GiB volume can be configured with 100 to 9,600 IOPS.
- IBM Cloud adjustable IOPS, which supports nondisruptive, dynamic adjustment of IOPS capacity within the limits of its two service tiers. Endurance volumes support IOPS settings greater than 0.25 IOPS per GB, while Performance -- or allocated IOPS -- volumes support anything between 100 and 48,000 IOPS.
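Using the Azure figures above, the configurable IOPS range for a given Ultra disk size can be sketched as follows. The flat 100-IOPS floor is taken from the 32 GiB example in the text; treat it as an assumption rather than a documented minimum:

```python
def azure_ultra_iops_range(size_gib: int) -> tuple[int, int]:
    """Approximate configurable IOPS range for an Azure Ultra disk,
    per the 300 IOPS/GiB ratio and 160,000-IOPS cap quoted above.
    The 100-IOPS floor is an assumption from the text's example."""
    minimum = 100
    maximum = min(300 * size_gib, 160_000)
    return minimum, maximum

print(azure_ultra_iops_range(32))    # (100, 9600) -- matches the text
print(azure_ultra_iops_range(1024))  # (100, 160000) -- hits the per-disk cap
```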
In contrast, most cloud services linearly scale IOPS performance with volume size. For example, Google Cloud Platform (GCP) SSD Persistent Disks provide 30 read and write IOPS per gigabyte to a maximum of 60,000 read IOPS, depending on the number of attached vCPUs, block size and other parameters. Similarly, Oracle Cloud Infrastructure (OCI) Block Higher Performance Volumes provide 75 IOPS per gigabyte (at a 4 KB block size) up to 35,000 IOPS per volume. Thus, if an application requires only a 128 GB volume, GCP SSDs can provide 3,840 IOPS and OCI Higher Performance can deliver 9,600 IOPS, while Amazon EBS io2 can be configured as high as 64,000 IOPS.
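The 128 GB comparison above is straightforward to verify. A sketch using the per-GB ratios and caps from the text; the 500 IOPS/GiB ratio for io2 is an assumption consistent with the 64,000-IOPS figure quoted:

```python
def capped_iops(per_gb: int, size_gb: int, cap: int) -> int:
    """Linear per-GB IOPS scaling, limited by a per-volume cap."""
    return min(per_gb * size_gb, cap)

SIZE_GB = 128
gcp_ssd = capped_iops(30, SIZE_GB, 60_000)   # 3,840 IOPS
oci_hp  = capped_iops(75, SIZE_GB, 35_000)   # 9,600 IOPS
ebs_io2 = capped_iops(500, SIZE_GB, 64_000)  # 64,000 IOPS (assumed ratio)
print(gcp_ssd, oci_hp, ebs_io2)
```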
Choosing a service
Most cloud storage services and array vendors scale IOPS with capacity because the technical limitations of drive and controller technology require scaling out devices to deliver more throughput and IOPS capacity. However, for some applications, working set size and I/O throughput requirements don't scale in tandem.
For example, as AWS points out in its blog about EBS gp3 volumes, some apps, such as MySQL and Hadoop, require high performance but not high storage capacity. Similarly, microservices-based applications that might have small working sets with many transactions to a shared storage pool can be accelerated by reducing storage latency and increasing IOPS. In these cases, a cloud service like EBS io2 or gp3 or a storage product like NetApp SolidFire, which doesn't couple faster IOPS to larger volume sizes, will deliver better performance for the money.