In this 2012 Storage Decisions Chicago presentation, Jon Toigo, CEO and managing principal of Toigo Partners International, and chairman of the Data Management Institute, discusses the Linear Tape File System and solid-state drives, two technologies that can provide next-generation storage performance at a fraction of the power consumption seen in data centers today. Watch the video above or read the transcript below to learn more.
The most important metric you're likely to confront in the next 10 to 20 years is the cost and availability of electricity. When you're building the capture storage pool, which is also performance storage, you emphasize IOPS per watt. To control your data center power costs, you start looking for storage arrays that deliver extremely high performance without consuming a lot of electricity.
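To make the IOPS-per-watt idea concrete, here is a back-of-the-envelope sketch. The 410,000 IOPS and 1,900-drive figures come from the 3PAR anecdote below; the per-drive wattage and the hybrid array's numbers are illustrative assumptions, not measured values.

```python
# Back-of-the-envelope IOPS-per-watt comparison (illustrative numbers only).

def iops_per_watt(iops: float, watts: float) -> float:
    """Efficiency metric: I/O operations per second per watt consumed."""
    return iops / watts

# Hypothetical comparison: a large short-stroked disk rig vs. a small
# flash-assisted array. Assumes ~10 W per spinning drive; the hybrid
# array's figures are made up for illustration.
disk_rig = iops_per_watt(iops=410_000, watts=1_900 * 10)
hybrid = iops_per_watt(iops=400_000, watts=1_500)

print(f"disk rig: {disk_rig:.1f} IOPS/W")
print(f"hybrid:   {hybrid:.1f} IOPS/W")
```

Even with rough numbers, the gap is an order of magnitude, which is why the metric matters.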
We had an example of a storage array that was sitting on our floor (I won't say which one, but it was 3PAR) that just broke a land speed record for storage performance: 410,000 IOPS out of the crate. But they used 1,900 disk drives to get there. They could have made it even faster if the Storage Performance Council allowed them to short-stroke the disks -- use just the outermost tracks so that the read-write head doesn't have to move -- and they could have doubled up on the number of disks in the array. [By doing that] they probably could have beaten 800,000 IOPS, but as I said, the Storage Performance Council won't let you do the test with short-stroked disks. But X-IO, which has these little bricks with 16 drives and [solid-state drive] SSD memory in them, just announced they were blowing the socks off that 3PAR rig.
Why are we even using disk for retention storage? We could use NAS on steroids, or tape NAS. I'm thinking that just the cost and availability of energy will nudge us in the direction of using energy-centric metrics to build our data storage infrastructure. You don't have to care about the fate of Planet Earth to see the wisdom in doing this.
SSD optimization [can save power too], which is what I was just describing with X-IO. Right now, if you're using a whole bunch of short-stroked spindles, you're spending a fortune in electrical power to get the IOPS for your applications. So you buy a rig that has hundreds or thousands of drives in it, you short-stroke across all the spindles and you get high IOPS. That's the way we've always done it. It's also the wrong way, because it places a huge aggregate power demand on your data center.
The newer approach is to take a disk drive and an SSD and stick them in the same cabinet. Basically, you're using fewer drives than you were before and getting exactly the same IOPS. Let's say you have a whole bunch of users [using that data]. You write data to the disk and then that data gets hot -- there are a lot of concurrent accesses being made to the data. So temporarily, that data gets moved onto an SSD flash card. Now it's being serviced at 20,000 IOPS. That's the standard rating of flash, and all these users are using it. Over time, the users begin to disappear and the data gets cool. You repoint all the access back to the disk, and basically you live happily ever after. The cost of energy on this is a fraction of what it cost to get that performance out of many hundreds of spindles. It's the same or better IOPS with significantly lower power requirements. Have you thought about that? Has that even entered into your vocabulary yet? Because it will.
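The promote-when-hot, demote-when-cool cycle described above can be sketched in a few lines. The thresholds, tier names and block identifiers here are illustrative assumptions, not any vendor's actual policy or API.

```python
# Minimal sketch of a hot/cold tiering policy: data lands on disk, is promoted
# to flash while access is heavy, and is demoted back to disk when it cools.
# Thresholds and names are illustrative assumptions only.

class TieringPolicy:
    def __init__(self, promote_at=100, demote_at=10):
        self.promote_at = promote_at  # accesses/interval that make data "hot"
        self.demote_at = demote_at    # accesses/interval below which data is "cool"
        self.tier = {}                # block id -> "disk" or "flash"

    def record_interval(self, block, accesses):
        """Update a block's tier based on the last interval's access count."""
        current = self.tier.get(block, "disk")
        if current == "disk" and accesses >= self.promote_at:
            self.tier[block] = "flash"   # hot: service it from SSD
        elif current == "flash" and accesses <= self.demote_at:
            self.tier[block] = "disk"    # cool: repoint access back to disk
        return self.tier.get(block, "disk")

policy = TieringPolicy()
print(policy.record_interval("lun0/blk42", accesses=250))  # many users: flash
print(policy.record_interval("lun0/blk42", accesses=3))    # users gone: disk
```

Real arrays track access heat with more sophisticated statistics, but the state machine is essentially this.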
Tape NAS: LTFS
Tape NAS is your capacity storage play. Right now we field lots and lots of spindles and say we may be able to use disk for long-term file storage because we can spin down those drives. That's fine; it's not quite into the red like the high-performance disk is, but the truth is you're still consuming quite a lot of electricity. Here's another solution: Take your capture storage and set up a generic server with a little bit of caching disk in it for prefetch, and a tape library behind it. Now you have 196 [petabytes] of storage on two floor tiles, consuming about two lightbulbs' worth of electricity. And you're running something called the 'Linear Tape File System (LTFS),' which is free. You can download it from both IBM and Ultrium.com, the home of the Linear Tape-Open initiative.
Users see the data stored on that repository like any other NAS device. How long does it take you to access the data? Between 20 seconds and two minutes, depending on whether the tape is preloaded or you need to fetch it. There's a front end to this that's created by Crossroads, and when you write data to the tape, it takes the first 10 MB of the data and stores it on disk, so that when you transfer the file, you're fetching data off the disk and it masks the amount of time it takes for the robot arm to pull the tape, load it and find the start point. That's been possible since LTO-5 was available. That's the reason why some people have been saying, "Well, if LTFS is so good, then why has it taken them two years to do anything with it?" Because LTFS existed before we had partitioned tape. Partitioned tape in the LTO family only appeared in LTO-5. Partitioned means the tape is split in two: one partition holds all the data, while the other holds the start and stop points for every file stored on the tape, plus all the other metadata associated with it.
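The disk-stub trick described above is simple to sketch: keep the first 10 MB of each file on disk so a read can start streaming immediately while the library loads and positions the tape. Function and segment names here are illustrative assumptions, not the actual Crossroads implementation.

```python
# Sketch of the prefetch idea: the first 10 MB of each file lives on disk,
# masking the time the robot needs to load the tape and seek to the file's
# start point. Names are illustrative only.

STUB_BYTES = 10 * 1024 * 1024  # disk-resident head of each file

def plan_read(size_bytes):
    """Return the (source, nbytes) segments for one file read."""
    segments = [("disk-cache", min(size_bytes, STUB_BYTES))]
    if size_bytes > STUB_BYTES:
        # While the stub streams from disk, the robot loads the tape and
        # seeks to the start point recorded in the tape's index partition.
        segments.append(("tape", size_bytes - STUB_BYTES))
    return segments

print(plan_read(4 * 1024 * 1024))    # small file: served entirely from disk
print(plan_read(100 * 1024 * 1024))  # large file: stub from disk, rest from tape
```

If the tape load finishes before the client has consumed the 10 MB stub, the handoff is invisible; that is the whole point of sizing the stub against the worst-case robot fetch time.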
Is it perfect? No. Will it improve over time? I'm very sure it will. We've seen the broadcast industry adopt it. They're blending LTFS with media management systems. They have what they call their own 'media assets' and programs called 'media asset managers' that sit right on top of LTFS. [For example], they can go directly to the inning where a player on the Chicago Cubs hit a home run, find the clip they want and then deliver it for broadcast. But they're used to working with tape, which is why they were early adopters.