The benefits of clustered storage

Clustered storage combines multiple arrays or controllers to increase their performance, capacity or reliability. But the technology isn't right for every company. We outline what you need to know before deciding to adopt clustered storage.

Clustered storage promises better performance, scalability and reliability, but it's not designed to fit the needs of every storage environment.

In many cases, clustered storage is a cost-effective way to meet today's storage needs. But clustering isn't right for everyone.

Before choosing whether or how to adopt clustered storage, storage managers should understand their business and data access requirements. This includes asking themselves the following questions:

  • What requires the best performance: random or sequential I/O?
  • Which is more important: reliability or speed?
  • What storage protocols and topologies must be supported?
  • How quickly and to what point in time is recovery required after a disaster or hardware failure?

Clustering has made headlines over the past year. For example, EMC Corp. now supports clustered storage for archiving and backup; Hewlett-Packard (HP) Co. bought PolyServe and its clustered file server; IBM Corp. recently purchased XIV Ltd., a privately held storage technology company based in Tel Aviv, Israel; and Sun Microsystems Inc. acquired the Lustre file system.

While definitions vary, clustering generally refers to an architecture in which multiple resources (such as servers or storage arrays) work together to increase reliability, scalability, performance and capacity. Technically, clustering can be done at the level of the disk drive, as with RAID, in which multiple disk drives increase the scalability and reliability of the array. But clustering is more commonly understood to take place at the file-server or file-system level (see "Cluster vs. Grid vs. Global Namespace," below).

Cluster vs. Grid vs. Global Namespace

Even within the storage community, it's often hard to nail down the differences among clusters, grids, and other concepts such as global namespaces and storage virtualization.

At the Los Alamos National Laboratory in New Mexico, Gary Grider, deputy division leader of the laboratory's HPC Systems Integration Group, refers to his complex of tens of thousands of processors as a cluster rather than a grid. In the high-performance computing (HPC) world, he says, a grid usually implies processors linked by a WAN.

Some industry observers say a grid implies higher numbers of commodity components that work together in a tighter linkage than in clusters. Others compare both concepts to a global namespace, which "is software that runs on different servers and allows them to run a shared directory," says Greg Schulz, founder and senior analyst at StorageIO Group, Stillwater, MN. "A global namespace is a virtualization view without physically aggregating or consolidating any of the underlying file systems or storage systems," he adds.
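One way to picture Schulz's description is as a lookup layer that maps a shared directory tree onto file servers that remain physically separate. The Python sketch below is illustrative only: the class, server names and export paths are hypothetical and don't represent any vendor's implementation.

```python
# Hypothetical sketch of a global namespace: one logical directory tree
# that maps onto several independent file servers. Nothing is copied or
# consolidated; the namespace only translates paths.

class GlobalNamespace:
    def __init__(self):
        # logical prefix -> (server name, exported path on that server)
        self._mounts = {}

    def add_mount(self, logical_prefix, server, export_path):
        self._mounts[logical_prefix.rstrip("/")] = (server, export_path.rstrip("/"))

    def resolve(self, logical_path):
        """Return (server, physical path) for a path in the shared tree."""
        best = None
        for prefix in self._mounts:
            matches = logical_path == prefix or logical_path.startswith(prefix + "/")
            # The longest matching prefix wins, so nested mounts work.
            if matches and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(f"no file server exports {logical_path}")
        server, export_path = self._mounts[best]
        return server, export_path + logical_path[len(best):]


ns = GlobalNamespace()
ns.add_mount("/corp/engineering", "filer-a", "/vol/eng")
ns.add_mount("/corp/finance", "filer-b", "/vol/fin")

print(ns.resolve("/corp/engineering/specs/array.doc"))
print(ns.resolve("/corp/finance/q1.xls"))
```

Clients see a single tree under /corp, while each file stays on whichever server exports it.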

A clustered file system consists of a file system and volume manager installed across multiple application server nodes, says Greg Schulz, founder and senior analyst at StorageIO Group, Stillwater, MN. It creates a single logical data view, allowing any node to access data regardless of its location. Providing this shared access requires coordination among server nodes to prevent access conflicts and ensure data integrity. This approach is the foundation of NAS clustering, in which a file system is installed across multiple industry-standard servers.

A clustered file server, by contrast, consists of multiple NAS servers working as a single storage and file space instance, says Schulz. This creates a single, cohesive file system in which any node can access the file system as if it were running on one large physical NAS server. Data access is coordinated at the individual file server, so the nodes don't need to communicate with one another. Merely having dual CPUs, controllers or even NAS heads in an active-passive configuration, with the second component standing by in case the primary fails, doesn't constitute storage clustering, notes Schulz.

Clustered storage is often a good fit for high-performance computing (HPC) apps in which hundreds, thousands or tens of thousands of clients read and write data to extremely large data sets. But much of the current focus is on using clustered storage for everyday business apps such as databases and messaging systems.

Clustered storage systems also differ in the protocols or file systems they support, as well as in the levels at which they store and serve data. According to StorageIO Group's Schulz, clustered storage systems that move data at the block level using the iSCSI protocol include those from 3PAR Inc., EqualLogic Inc. (now part of Dell Inc.) and LeftHand Networks Inc. Clustered systems from BlueArc Corp., Ibrix Inc., Isilon Systems Inc., Network Appliance Inc., ONStor Inc., Panasas Inc. and PolyServe (now owned by HP) support NFS. Sun has made Lustre a key part of its clustering strategy, especially for HPC apps, and has pledged to support it on Linux, its own Solaris OS and on multivendor hardware.

Many clustered file systems update only metadata across all the storage nodes, which reduces network bandwidth requirements and makes it easier for large numbers of clients to access data concurrently.

EMC Dampens Clustering Rumors

At EMC Corp.'s Innovation Day last November, CEO Joe Tucci hinted that the company is preparing clustered storage products code-named Maui and Hulk. But the company later said Maui "is not a clustered file system or NAS solution, nor will it compete against existing EMC storage solutions." Maui will instead, according to EMC, provide intelligent storage for "companies requiring global scale/multipetabyte storage." EMC currently addresses clustered file systems with its Celerra systems that cluster multiple NAS heads within the same cabinet, according to the company.

Ease of use
When choosing clustered storage, storage administrators often focus on benefits such as ease of use, scalability or reliability, rather than on the underlying technology.

Jeff Pelot, chief technology officer at the Denver Health Hospital and Medical Center, is phasing out his Fibre Channel (FC) storage in favor of clustered storage from LeftHand Networks. While lower hardware costs were the original goal, lower administrative costs are now the main driver. "I don't know how many [other] people out there are managing 90TB with one person," he says.

HPC users often calculate their clustered storage performance requirements not by total capacity, but by the GB/sec the system can deliver. The Los Alamos National Laboratory in New Mexico, which simulates the performance of nuclear weapons, needs a specific rate of I/O because it must periodically capture the entire "state" of an HPC compute run (including the contents of RAM) to recover from a failure among the thousands of processors and disks in use, says Gary Grider, deputy division leader of the laboratory's HPC Systems Integration Group.

"It's the [file] system that keeps our calculations stable when processors, memory and the network" fail periodically during the months it takes a complex computing job to run, says Grider.
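The arithmetic behind a GB/sec requirement of this kind is straightforward: divide the amount of state to be captured by the time allowed to capture it. The Python sketch below works through one case. The function name and the figures (100TB of state, a 20-minute window) are hypothetical, chosen only for illustration, and aren't Los Alamos numbers.

```python
# Hypothetical sizing sketch: how much sustained write bandwidth a
# storage cluster needs to capture a compute run's full state in time.

def required_bandwidth_gb_per_sec(state_size_tb, checkpoint_window_sec):
    """GB/sec needed to write state_size_tb within checkpoint_window_sec."""
    if checkpoint_window_sec <= 0:
        raise ValueError("checkpoint window must be positive")
    return state_size_tb * 1000 / checkpoint_window_sec  # 1TB = 1,000GB


# Assumed figures, for illustration only.
state_tb = 100          # aggregate RAM contents to capture
window_sec = 20 * 60    # time the job can afford to spend on a checkpoint

rate = required_bandwidth_gb_per_sec(state_tb, window_sec)
print(f"Sustained write rate needed: {rate:.1f} GB/sec")
```

Halving the window doubles the required rate, which is why bandwidth rather than raw capacity tends to drive the design in these environments.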

Parag Mallick discovered that sometimes a vendor's help is required to decide how to distribute data within the file system. Mallick is director of clinical proteomics at the Spielberg Family Center for Applied Proteomics at the Cedars-Sinai Louis Warschaw Prostate Cancer Center in Los Angeles. The organization uses clustered storage devices from Isilon Systems Inc. to store 120TB of data about the proteins found in the blood of cancer patients. Mallick worked with Isilon Systems to design the best directory layout to maximize the predictive caching done by the system, storing the most frequently accessed files in RAM to speed their retrieval.

Different data access needs also call for various clustering technologies. "If I'm supporting a video application, I might want a NAS file server cluster optimized for large throughput of either concurrent streams or one large file being transmitted in parallel," says Schulz.

Clustered storage is a good choice for storage administrators who have "20, 30 or 50 NAS filers ... and really would like to put them up under one namespace," says Grider. He also recommends clustered storage when applications require many processors or users to write to one file simultaneously, when storage administrators need to scale bandwidth along with capacity, or when there's a lot of metadata to manage.

What's next?
Industry observers predict that more vendors will develop cluster or grid-based storage systems similar to the Google File System (GFS) developed by the search giant for its own massive storage needs. As described in a 2003 research paper called "The Google File System" by Sanjay Ghemawat, Howard Gobioff and Shun-Tak Leung, GFS is a "scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients."

Storage industry consultant Robin Harris praises GFS for its reliability, performance on large sequential reads, features such as automatic load balancing and storage pooling, and its low cost. But its shortcomings as a general-purpose storage platform include its "performance on small reads and writes, which it wasn't designed for and isn't good enough for general data center workloads," notes Harris.

Google has since built Bigtable, a distributed storage system for managing petabytes of structured data. It "provides very good performance for small reads and writes," says Jeff Dean, a Google Fellow in the Systems and Infrastructure Group.

Analysts and other observers differ about whether these mega-storage projects could serve as a foundation for commercial systems. If Google were starting today, "we'd probably still build our own because I'm not aware of any system that scales to the sizes that we need at reasonable price/performance ratios," says Dean. Building its own systems, he says, gives Google "more flexibility because we can control the underlying storage system that sits underneath our applications."


In January, IBM announced plans to acquire XIV, which claims its grid-based architecture creates an unlimited number of snapshots in a very short time by replicating data among Intel-based servers running a custom version of Linux and linked by redundant Gigabit Ethernet switches. Because each node has its own processors, memory and disk, according to the company, CPU power and the memory available for cache operations increase as storage capacity rises. IBM says that by distributing each logical volume across the grid as multiple 1MB stripes, the architecture provides consistent load balancing even as the size of volumes or drive types on the grid changes.
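The load-balancing effect of fine-grained striping can be shown with a small Python sketch. It cuts a volume into 1MB stripes and assigns each stripe to a node by hashing, then counts how many stripes land on each node. The hash-based placement, the function names and the node names are assumptions made for illustration; the article's sources don't describe how XIV actually places stripes.

```python
import hashlib
from collections import Counter

STRIPE_SIZE = 1024 * 1024  # 1MB stripes, as in the description above


def stripe_to_node(volume_id, stripe_index, nodes):
    """Pick a node for one stripe by hashing the volume ID and stripe number."""
    key = f"{volume_id}:{stripe_index}".encode()
    digest = hashlib.sha256(key).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def placement(volume_id, volume_bytes, nodes):
    """Count how many of a volume's stripes land on each node."""
    stripe_count = -(-volume_bytes // STRIPE_SIZE)  # ceiling division
    return Counter(stripe_to_node(volume_id, i, nodes) for i in range(stripe_count))


nodes = [f"node-{n}" for n in range(6)]    # hypothetical six-node grid
counts = placement("vol-a", 10 * 1024**3, nodes)  # a 10GB volume

for node in nodes:
    print(node, counts[node])
```

Because every volume is spread across all nodes in small pieces, no single node holds a disproportionate share, whatever the volume's size. A production system would also need a placement scheme that limits data movement when nodes join or leave, which this simple modulo mapping does not attempt.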

The technology will be aimed at users running Web 2.0 applications and storing digital media. But speaking in a conference call sponsored by the Wikibon consulting community, storage consultant Josh Krischer pointed out that the system doesn't support mainframe connectivity and constitutes "another level of storage between the high end and the top of the midrange" in IBM's current storage offerings. Rather than being optimized for Web 2.0 storage, said Krischer, "this is general-purpose storage" that IBM will bring to the market at aggressive price points because of its use of industry-standard hardware and open-source software.

Architectures such as GFS "will be more of what you see in the future," says John Matze, one of the architects of the iSCSI protocol and VP of business development at IP SAN vendor Hifn Inc. As network bandwidth becomes less expensive and storage nodes become more intelligent, he predicts the rise of more cluster or grid-like storage environments in which individual nodes have the intelligence to recover from inevitable failures.

Several vendors offer highly distributed storage designed to deliver very high levels of security, redundancy, availability and scalability at lower price points than RAID. For example, NEC Corp.'s Hydrastor distributes data among servers that act as accelerator nodes or storage nodes, allowing users to independently scale the amount of storage or the processing power devoted to managing it.

NEC claims Hydrastor is more scalable than clustered storage systems, "which have a finite number of controllers and storage capacity, as well as centralized file-system information and mapping tables on each node. Together, these factors create hard upper limits to scalability, forcing end users to deploy and manage additional systems that function as isolated data silos," writes an NEC spokesperson in an email.

According to NEC, the storage nodes in Hydrastor behave as a single, self-managed pool of storage that automatically balances capacity and performance across nodes, and rebalances storage among nodes as capacity is added or a node fails. In addition to lowering management costs, claims NEC, this gives Hydrastor "near limitless" scalability.
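The general idea of a self-rebalancing pool can be sketched in a few lines of Python. The example below keeps a map of data chunks to nodes and evens out that map whenever the list of live nodes changes, moving as few chunks as possible. It is a naive greedy sketch under assumed names (rebalance, chunk-N, node-N) and is not a description of NEC's algorithm, which the company doesn't detail here.

```python
from collections import Counter, defaultdict


def rebalance(assignment, nodes):
    """Spread chunks evenly over `nodes`, moving as few chunks as possible.

    assignment: dict mapping chunk name -> node that currently holds it
    nodes:      list of nodes that are alive after the change
    """
    per_node = defaultdict(list)
    to_place = []  # chunks that must move
    for chunk, node in sorted(assignment.items()):
        if node in nodes:
            per_node[node].append(chunk)
        else:
            to_place.append(chunk)  # its node failed or was removed

    # Fair share per node; the first `extra` nodes take one more chunk.
    base, extra = divmod(len(assignment), len(nodes))
    quota = {n: base + (1 if i < extra else 0) for i, n in enumerate(sorted(nodes))}

    for n in nodes:  # overfull nodes give up their surplus
        while len(per_node[n]) > quota[n]:
            to_place.append(per_node[n].pop())
    for n in sorted(nodes):  # underfull nodes absorb it
        while len(per_node[n]) < quota[n]:
            per_node[n].append(to_place.pop())

    return {chunk: n for n, chunks in per_node.items() for chunk in chunks}


# Twelve chunks on three nodes, then a fourth node is added.
start = {f"chunk-{i}": f"node-{i % 3}" for i in range(12)}
grown = rebalance(start, ["node-0", "node-1", "node-2", "node-3"])
print(Counter(grown.values()))

# node-1 then fails; its chunks are spread over the survivors.
shrunk = rebalance(grown, ["node-0", "node-2", "node-3"])
print(Counter(shrunk.values()))
```

In a real system the chunks on a failed node would have to be rebuilt from redundant data held elsewhere; the sketch tracks only where each chunk should live.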

StorageIO Group's Schulz challenges NEC's claims of differentiation. "Hydrastor is a cluster as much as it is a grid, just like many other cluster-based systems are also grids," he says. He also disagrees that a grid is more scalable than a cluster, saying "it comes down to the architecture and the implementation in question and, more importantly, what is practical in terms of real-world supportable and shippable solutions vs. theoretical marketing."

Dispersed storage software from startup Cleversafe Inc. distributes data on multiple remote servers in encrypted slices that, claims chairman and CEO Chris Gladwin, eliminate the need to store multiple copies of data (as with replication).
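A toy example shows why slicing data can avoid keeping full copies. The Python sketch below splits data into k slices plus one XOR parity slice, so any single lost slice can be rebuilt from the rest at a storage overhead of (k+1)/k rather than the 2x or more that replication needs. This is a deliberately simplified single-parity scheme with hypothetical function names. It omits encryption, survives only one lost slice, and shouldn't be read as Cleversafe's actual algorithm.

```python
# Simplified dispersal sketch: k data slices plus one XOR parity slice.

def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data, k):
    """Split data into k equal slices and append one parity slice."""
    slice_len = -(-len(data) // k)               # ceiling division
    padded = data.ljust(slice_len * k, b"\0")    # pad so slices are equal
    slices = [padded[i * slice_len:(i + 1) * slice_len] for i in range(k)]
    parity = bytes(slice_len)
    for s in slices:
        parity = xor_bytes(parity, s)
    return slices + [parity], len(data)


def rebuild(slices, original_len):
    """Reassemble the data; at most one slice may be missing (None)."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one lost slice")
    slices = list(slices)
    if missing:
        present = [s for s in slices if s is not None]
        recovered = bytes(len(present[0]))
        for s in present:
            recovered = xor_bytes(recovered, s)  # XOR of the rest = lost slice
        slices[missing[0]] = recovered
    return b"".join(slices[:-1])[:original_len]  # drop parity and padding


record = b"patient-record-0001"
slices, length = disperse(record, k=4)   # five slices, one per remote server
slices[2] = None                         # simulate one server being unreachable
print(rebuild(slices, length) == record)
```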

Gladwin says NEC's entry into the market validates the idea of clustered or grid storage. "The real competition that both NEC and Cleversafe have is the old way of doing things," he says. No matter how it's done, clustering and grid storage are the new ways of doing things, and it's up to storage administrators to choose their approaches carefully (see "Eight questions to ask a clustered storage vendor," below).

Eight questions to ask a clustered storage vendor

    1. Will the vendor design an optimal file structure to take advantage of predictive caching or other features of their system?

    2. What protocols do the clustered systems run? Can they take advantage of newer, lower cost protocols such as iSCSI?

    3. Does the product require agents, adapters, initiators, drivers or other software to run on the application servers? These can add complexity or possibly hurt performance.

    4. Is the product designed to deliver good sequential performance for streaming video or random I/O for transaction processing?

    5. How many nodes will the cluster support without sacrificing ease of management and performance?

    6. If additional hardware nodes are added, does the vendor charge more for clustering software?

    7. If a node fails, how does the system handle data migration or replication to ensure uninterrupted application and data availability?

    8. How easy is it to add new resources to the cluster?
