- George Crump, Storage Switzerland
High-performance storage systems have been with us for a while. In the 1990s, these systems were DRAM-based and primarily used to accelerate transaction-oriented databases. The expense of these systems made them hard to justify and limited their use to applications where massive performance acceleration would make the organization more money.
Times have changed. High-performance flash data storage systems have become easier to justify as costs have come down, moving them into the enterprise and expanding their adoption. Today, flash is breaking new thresholds in price per gigabyte and, as a result, use cases are changing once again.
Flash is the workhorse
You used to only deploy flash data storage systems, like their dynamic RAM (DRAM) brethren, for applications that could take advantage of their performance. But flash is now the go-to media for primary data storage. Any application or data set that can justify the use of primary storage can also justify the use of flash.
The next step in flash adoption will drive the technology in three directions. The most surprising is into secondary storage, such as archive and backup. Another is into even faster performance than what is available today. The third is the use of flash as a replacement for DRAM.
Flash for secondary storage
Five years ago, using flash within a secondary storage system was unheard of because of the cost. That's changing with continuing price erosion and greater density. Today, storage system vendors can provide petabytes of storage in a 2U to 3U package. As building and powering new data centers becomes more expensive, squeezing more capacity into the same space becomes more compelling, even if flash is a little more expensive.
At the same time, the performance advantages aren't lost on the markets that secondary storage serves. A big data analytics project that can store all of its data on flash can generate results faster across a much wider set of data. An archive that holds many years' worth of information, yet can respond to a request for data almost instantly, is hard to resist, especially in a world filled with impatient users.
This ability to respond instantly to unpredictable access requests is critical to industries such as media and entertainment. That industry used to distribute its content once to an unlimited number of destinations, whether the audience wanted it or not -- think broadcast TV. It now operates almost totally on demand, waiting for a particular user to request a piece of content -- think Netflix.
Do tape and HDDs have a future?
As flash continues its march to domination of the data storage industry, you have to wonder what place hard drives and tape systems have in the modern data center. Of the two technologies, tape may be best insulated from extinction.
Ignoring cost per gigabyte, tape has some unique properties:
- Tape is designed to be offline. In our totally connected world, data is under constant attack from hackers and always exposed to rogue employees. A disconnected copy is immune from those attacks.
- Tape is portable. Its high capacity per cartridge -- 32 TB in the next generation -- and its ruggedized construction mean an entire data center can be shipped via overnight truck, outpacing the bandwidth of the fastest WAN connection.
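The portability point can be checked with rough arithmetic. The sketch below uses the 32 TB per cartridge figure from the text; the cartridge count, transit time and WAN speed are hypothetical values picked only for illustration, and the calculation ignores the time needed to write and read the tapes.

```python
# Figure from the text: 32 TB per next-generation cartridge.
TB_PER_CARTRIDGE = 32

# Hypothetical values, chosen only to illustrate the comparison.
CARTRIDGES = 1_000      # cartridges loaded on one truck
TRANSIT_HOURS = 24      # overnight delivery window
WAN_GBPS = 100          # assumed WAN link speed


def truck_gbps(cartridges: int, tb_per_cartridge: float, transit_hours: float) -> float:
    """Effective throughput of a physical shipment, in gigabits per second."""
    bits = cartridges * tb_per_cartridge * 1e12 * 8
    return bits / (transit_hours * 3600) / 1e9


def wan_days(total_tb: float, wan_gbps: float) -> float:
    """Days needed to move total_tb over a fully used link of wan_gbps."""
    bits = total_tb * 1e12 * 8
    return bits / (wan_gbps * 1e9) / 86_400


shipment_tb = CARTRIDGES * TB_PER_CARTRIDGE
print(f"Shipment size: {shipment_tb:,} TB")
print(f"Truck throughput: {truck_gbps(CARTRIDGES, TB_PER_CARTRIDGE, TRANSIT_HOURS):,.0f} Gbps")
print(f"Same data over the WAN: {wan_days(shipment_tb, WAN_GBPS):.1f} days")
```

Under these assumptions the truck delivers far more data per second than the link does, which is the sense in which tape outpaces a WAN connection.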
Hard disk-based systems are more vulnerable, especially as the density per flash device increases. While 10 TB HDDs are available and 20 TB drives will be available soon, the expectation is that 32 TB flash data storage devices will ship sooner. Also, to get these HDD capacities, hard drive vendors have had to make compromises in the way that data is written and read. And hard drives are not designed to be taken offline and are not nearly as portable as tape media.
Hard drives still provide a price-per-gigabyte advantage over flash, and tape holds an advantage over both. It is reasonable to assume that flash will continue to narrow this gap. If price per gigabyte is the only advantage HDDs have, then the technology is clearly at risk.
Another surprising use case for flash data storage is backup and data protection. A backup software product is dependent on its database, which tracks all sorts of metadata. The speed of flash allows it to add data faster and to respond to user search requests instantly. But the big shift in data protection is recovery in place. Most backup applications can host a virtual machine's data store directly on the backup target. Suddenly, backup storage just became primary storage. If that primary storage is full of high-capacity, compressed and deduplicated HDDs, the performance of that data store will be a problem. This is forcing some vendors to turn off these features in case a customer wants to use the recovery-in-place feature.
A flash-based backup storage system makes recovery in place a viable method to return an application to service at a performance point that meets the user's expectations. Remember, prior to failure, these users had grown accustomed to flash performance in the primary storage system. They may be unhappy to compromise performance even in the recovered state.
Higher performance flash
As fast as flash systems are, there are applications that can use even more speed. Besides, the excess in performance that flash delivers to enterprises today is likely short-lived. Eventually, application developers will catch up and create applications that will require more performance than what current flash systems can deliver.
Most performance issues surrounding flash implementations have more to do with the package surrounding the media than the media itself. For a flash-based storage system, we're talking about the internal connectivity between the flash media and the internal CPUs running the storage software. The latency between these connections and the quality of the software are the big challenges.
The data storage industry developed nonvolatile memory express (NVMe) to specifically solve this problem. NVMe is the next step in storage protocols that enable the operating software to talk to media. It is designed specifically for flash, whereas the SCSI protocol it replaces was designed in an era of HDDs.
NVMe strips out unnecessary overhead found in the SCSI stack. It also supports far more queues: roughly 64,000, compared with the single queue of the legacy Advanced Host Controller Interface (AHCI). Each NVMe queue can hold roughly 64,000 commands, instead of the 32 that AHCI supports in its single queue. NVMe can also do more per CPU cycle than standard SCSI can.
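To put the queue numbers in perspective, here is a small calculation of the maximum number of commands each interface could have outstanding at once. It uses the rounded figures quoted above; the constant and function names are illustrative, and real devices typically expose far fewer queues than the protocol ceiling.

```python
# Rounded limits as quoted in the text.
AHCI_QUEUES = 1
AHCI_COMMANDS_PER_QUEUE = 32
NVME_QUEUES = 64_000
NVME_COMMANDS_PER_QUEUE = 64_000


def max_outstanding(queues: int, commands_per_queue: int) -> int:
    """Upper bound on commands in flight: queues times queue depth."""
    return queues * commands_per_queue


ahci_total = max_outstanding(AHCI_QUEUES, AHCI_COMMANDS_PER_QUEUE)
nvme_total = max_outstanding(NVME_QUEUES, NVME_COMMANDS_PER_QUEUE)
ratio = nvme_total // ahci_total

print(f"AHCI: {ahci_total:,} commands in flight")
print(f"NVMe: {nvme_total:,} commands in flight")
print(f"NVMe ceiling is {ratio:,} times higher")
```

The gap matters because flash can service many requests in parallel, while a single 32-command queue was sized for the mechanics of a spinning disk.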
At some point, data has to leave the storage system and reach the applications accessing it. This is another area where latency reduction is critical, and NVMe helps here through NVMe over Fabrics, which works on Fibre Channel (FC) and Ethernet networks. Today, the iSCSI and FC protocols transport SCSI. That means no matter how much bandwidth they gain, they remain burdened by SCSI's single-threaded nature. NVMe over Fabrics lets these networks take advantage of a greater number of queues and commands, essentially optimizing these faster networks.
Flash as RAM
Flash started life as a more expensive, but faster alternative to HDDs. Now, it is ready to evolve into a slower, but less expensive alternative to system RAM. The need for RAM in servers is being driven by in-memory databases, big data analytics processing, and highly dense virtualized and containerized environments. The problem is that RAM is expensive, and most servers have constraints on how much RAM an IT department can install per server. Many of these in-memory database environments buy additional servers not because they need more compute power, but because they need more memory.
Flash data storage as RAM is essentially flash installed on a DIMM that plugs into a system motherboard. Because the density of flash is higher than that of DRAM, significantly more capacity can be added to the server than ever before. Because the flash DIMM sits on the memory bus, it has a high-speed path to the CPU. The flash DIMM driver manages the movement of data between flash DIMMs and DRAM DIMMs automatically. Essentially, it creates a tiering mechanism for memory.
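The tiering idea can be illustrated with a toy model. The sketch below is not how any actual flash DIMM driver works; it is a minimal simulation, assuming a small DRAM tier that keeps the most recently used items and a larger flash tier that absorbs whatever DRAM evicts. The class name `TieredMemory` and its methods are hypothetical.

```python
from collections import OrderedDict


class TieredMemory:
    """Toy two-tier memory: a small, fast DRAM tier in front of a large flash tier."""

    def __init__(self, dram_slots: int):
        if dram_slots < 1:
            raise ValueError("dram_slots must be at least 1")
        self.dram_slots = dram_slots
        self.dram = OrderedDict()  # hot tier, ordered oldest to newest use
        self.flash = {}            # capacity tier, holds demoted items

    def write(self, key, value):
        """Store a value in DRAM, demoting the coldest item if DRAM is full."""
        self.flash.pop(key, None)  # drop any stale copy in flash
        self.dram[key] = value
        self.dram.move_to_end(key)
        self._demote_cold_items()

    def read(self, key):
        """Return a value, promoting it from flash to DRAM on a DRAM miss."""
        if key in self.dram:
            self.dram.move_to_end(key)
            return self.dram[key]
        value = self.flash.pop(key)  # raises KeyError if the key is unknown
        self.dram[key] = value
        self._demote_cold_items()
        return value

    def _demote_cold_items(self):
        """Move least recently used items to flash until DRAM fits its limit."""
        while len(self.dram) > self.dram_slots:
            cold_key, cold_value = self.dram.popitem(last=False)
            self.flash[cold_key] = cold_value


mem = TieredMemory(dram_slots=2)
for name, value in [("a", 1), ("b", 2), ("c", 3)]:
    mem.write(name, value)
first_read = mem.read("a")  # "a" was demoted to flash, so this promotes it
print("DRAM tier:", list(mem.dram))
print("Flash tier:", list(mem.flash))
```

The application sees one large pool of memory, while the hot working set stays in the faster tier.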
Because flash DIMMs require an updated ROM BIOS, IT professionals must verify which server vendors support which, if any, flash DIMMs.
The key drivers for the data center have remained unchanged for decades: Organizations want data to respond faster, and they want to store more and more of it. Storage systems need to become continually faster and denser to keep up. The most unpredictable change in flash storage is its movement into an area of the market that has been dominated by hard disk and tape systems. More predictably, there is a constant need for more performance. The adoption of protocols such as NVMe, as well as memory bus-based flash DIMMs, will ensure flash technology keeps pace with user demands.