What is the difference between traditional tiering and sub-LUN tiering?
Tiering means moving data from one kind of storage to another in a manner that best balances the access requirements and business value characteristics of the data. Sub-LUN tiering is less about optimizing storage capacity -- by placing the right bits on the right type of storage -- than it is about optimizing the performance (speed) of data access with the minimum number of energy-consuming disk drives.
To understand developments in tiering, though, we have to go pretty far back.
I call it "tears of storage" these days because, frankly, 30 years ago we had a tiering model that was built right into the operating system of the mainframe. It was called DFHSM, IBM's software for doing Hierarchical Storage Management (HSM) as an operating system function. This hierarchical system was a simplified form of tiering: after a period of time specified by policy, data migrated from one kind of storage to another to optimize storage use.
On the mainframe, memory was in short supply, so you wanted to get your data out of there quickly. To do so, you bumped it down to a direct access storage device (DASD), which was disk. But disk in those days was physically very big, with very little capacity. If you needed to add more disk arrays, you needed a new building to house them. So, we wanted to move things off disk as quickly as possible and onto tape. Tape was portable, it was very capacious and it didn't cost nearly as much as either memory or disk.
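The memory-to-disk-to-tape model described above can be sketched in a few lines of Python. This is only an illustration of age-based migration, not how DFHSM itself was implemented; the tier names, the idle-day limits and the `Dataset` and `migrate` names are all assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical tiers, ordered from fastest and most expensive to cheapest.
TIERS = ["memory", "disk", "tape"]

# Hypothetical policy: days a dataset may sit idle on a tier before it is
# demoted. Tape is the bottom tier, so it has no limit.
MAX_IDLE_DAYS = {"memory": 1, "disk": 30}


@dataclass
class Dataset:
    name: str
    tier: str
    idle_days: int


def migrate(ds: Dataset) -> Dataset:
    """Demote a dataset one tier if it has been idle longer than policy allows."""
    limit = MAX_IDLE_DAYS.get(ds.tier)
    if limit is not None and ds.idle_days > limit:
        next_tier = TIERS[TIERS.index(ds.tier) + 1]
        return Dataset(ds.name, next_tier, ds.idle_days)
    return ds
```

Under these assumed limits, a dataset that has sat untouched on disk for 45 days would move to tape, while one accessed today in memory would stay where it is.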
We had a basic model, a hierarchical storage management scheme that was developed as part of the operating system in the IBM world. When many companies abandoned the mainframe, they threw the baby out with the bath water. A rigorous and dependable scheme of storage tiering was lost.
And early on we discovered that slow networks impaired our ability to translate HSM from the mainframe environment into the distributed computing landscape.
Today’s networks have been catching up in terms of speed. That doesn't mean that we're actually doing tiering between different rigs at this point. The vendors have decided, "We're going to sell you a box that has some flash in it, some disk in it -- fast disk with low capacity and slow disk with high capacity. And we're going to tier inside the box, charging the consumer extra for that privilege: both for the software involved and by increasing the costs of all of the disk drives and solid-state drives (SSDs)." Their argument is simple: you pay a premium for on-box tiering because it's all in one place and it's so convenient.
One improvement along the way has been “sub-LUN” tiering. With some storage arrays, automatic tiering responds to frequent and concurrent accesses to data in the array. If a lot of access is being made to a specific dataset, the data may be copied from disk to SSD for faster response. When access frequency and concurrency diminish, requests for the data are re-pointed back to the disk copy and the SSD copy is deleted. This strategy does have the merit of improving I/O performance without adding a lot of disk drives to the kit, which reduces power consumption as well.
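The promote-and-re-point behavior described above can be sketched as follows. It is a simplified model, not any vendor's algorithm: the `SubLunTierer` class, the per-interval hit counters and the two thresholds are assumptions chosen for illustration, and real arrays track access at the granularity of extents or pages inside a LUN in their own ways.

```python
class SubLunTierer:
    """Toy model of hot-extent promotion inside a single array."""

    def __init__(self, promote_at=100, demote_below=10):
        # Hypothetical thresholds: accesses per interval.
        self.promote_at = promote_at
        self.demote_below = demote_below
        self.hits = {}       # extent id -> accesses in the current interval
        self.on_ssd = set()  # extents that currently have an SSD copy

    def read(self, extent):
        """Record an access and report which copy serves the request."""
        self.hits[extent] = self.hits.get(extent, 0) + 1
        return "ssd" if extent in self.on_ssd else "disk"

    def rebalance(self):
        """End the interval: copy hot extents to SSD, drop cold SSD copies."""
        for extent in set(self.hits) | self.on_ssd:
            count = self.hits.get(extent, 0)
            if count >= self.promote_at:
                self.on_ssd.add(extent)      # copy disk -> SSD
            elif count < self.demote_below:
                self.on_ssd.discard(extent)  # re-point to disk, delete SSD copy
        self.hits.clear()
```

The gap between the two thresholds is a design choice: an extent whose access rate hovers near a single cutoff would otherwise be copied to SSD and deleted again on every interval.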
The problem is, when you run out of space in an array that does autotiering or sub-LUN tiering, you have to stand up another array. Now you're managing multiple tiering solutions. That's traditional tiering gone crazy.
Frankly, I'd like to see that fixed. I think there are ways to do it. One is to virtualize your storage. I think the most accurate way to describe software-defined storage is that it's the implementation of a software-based virtualization controller over the top of your storage. This kind of virtual controller must be both hardware- and software-agnostic. We know about hardware agnosticism. We don't care if it's a Hitachi, IBM or NetApp on the outside of the box. We want to be able to migrate data easily between volumes created from the spindles in all these different rigs.
But software agnosticism is also keenly important because now you hear all the virtualization software vendors in the server space trying to get into the storage virtualization game. In essence, they want to own the storage. They want to own the network. Just as they now own the server box, both in terms of physical gear and software.
Vendors like VMware are doing this in cahoots with their partners -- especially EMC. Dell is trying to come up with its own model with all the different hardware and software layers in its catalog. HP, of course, is trying to drive everything down to its servers, switches and 3PAR storage rigs.