What's inside internal storage clouds?
If you thought you knew cloud storage, think again. With scores of vendors touting internal storage clouds, we'll give you the lowdown on what makes a storage cloud "internal."
Just about every vendor is touting some kind of cloud storage product or service. Here's the lowdown on what constitutes an internal storage cloud.
"There is no such thing as a private storage cloud today," declares Stephen Foskett, director of consulting at Nirvanix Inc., a public cloud storage vendor. Maybe so, but that didn't stop the General Services Administration (GSA) from issuing a request for quotes (RFQ) in early August for what appears at first to be a private storage cloud.
But what the GSA considers a private or internal storage cloud may differ considerably from what most enterprises would consider an internal cloud. As noted in the RFQ: "The initial acquisition of these services will be facilitated by GSA through the GSA Cloud Computing Storefront Site -- which will enable Government purchasers to buy (using a credit card or other acceptable payment option) Infrastructure as a Service (IaaS) offerings as needed through a common Web Portal, called the Cloud Computing Storefront, which will be managed and maintained by GSA."
Even given that there's no commonly accepted definition for internal storage clouds, the GSA's RFQ seems to describe something completely different. The Feds are asking not for an internal storage cloud or a public storage cloud but for what they label an internal Cloud Computing Storefront, a portal or gateway through which Federal agencies can purchase and access public cloud storage services for their internal use. Even Foskett at Nirvanix, which is preparing a response to the RFQ, seemed puzzled.
The government seems to be on the right track in one regard. However you define internal storage clouds, they promise to reduce storage costs and simplify the storage process. According to the GSA, "Cloud computing has the capability to reduce the cost of IT infrastructure by utilizing commercially available technology that is based on virtualization of servers, databases and applications to allow for capital cost savings …". The GSA initiative encompasses both storage and compute clouds.
The problem with internal storage clouds isn't that they don't exist, but that there are too many versions of what an internal storage cloud could be. "The cloud refers to a layer of abstraction," said Greg Schulz, founder and senior analyst at Stillwater, Minn.-based StorageIO Group. "Almost any storage product can be configured as part of an internal storage cloud. It comes down to your definition. A vendor will define the storage cloud to fit whatever he is selling."
Although there's no widely accepted definition for an internal storage cloud, industry analysts have been identifying the elements needed to create one and explaining how those pieces might be connected. And despite the cloud mystique, "anybody can do this," said John Webster, principal IT advisor at Nashua, N.H.-based Illuminata Inc. Internal cloud storage isn't brain surgery.
Although internal storage clouds are a rarity today, it's clear what their appeal will be. "This is about performance vs. cost. The internal storage cloud is focused on cost," said Carter George, vice president of products at Ocarina Networks. Conventional storage consisting of sophisticated storage arrays, storage-area networks (SANs), high-performance disk drives, and elaborate backup and recovery, by contrast, focuses on performance and data protection.
But low cost need not be the primary focus, according to Abbott Schindler, an independent storage consultant in Bend, Ore. Cost is top of mind today, Schindler said, because "most start with clouds by thinking about archival storage or data protection so they design it for cheap and slow. There is nothing inherent in the cloud concept, however, that says it cannot be used for transactional data."
Internal storage cloud defined
You could say an internal storage cloud is the same as a public storage cloud -- storage delivered as a service over the network -- except the components of an internal storage cloud sit behind the firewall. But even that definition isn't completely accurate. A public storage cloud provider, for instance, can reserve a portion of its capacity for the exclusive use of one customer, making it a private storage cloud although it's not internal to the customer (see "Different types of storage clouds," below).
|Different types of storage clouds|
Public storage clouds. Services like Amazon's Simple Storage Service (S3) and Nirvanix Inc.'s Storage Delivery Network make massive amounts of file storage available at low cost. Multi-tenancy allows the providers to keep each customer's storage and apps separate and private. Portions of the public storage cloud can be carved out to create what amounts to a private storage cloud.
Private storage clouds. With a private storage cloud, a company owns or controls the infrastructure and how applications are deployed on it. Private clouds may be deployed in an enterprise data center or at a co-location facility. Private clouds can be built and managed by a company's own IT organization or by a service provider.
Internal storage clouds. This type of storage cloud is similar to a private storage cloud except that it remains inside the organization's firewall. It may be built with the help of consultants or integrators, but it's hosted and maintained by the IT department.
Hybrid storage clouds. A hybrid storage cloud combines attributes of both public and private/internal clouds. It's mainly used to access on-demand, externally provisioned capacity on a temporary basis. The ability to augment a private or internal cloud with capacity from a public cloud can help a company maintain service levels in the face of rapid workload fluctuations or planned workload spikes. Hybrid clouds, however, introduce the complexity of determining how to distribute applications across both a public and private cloud.
Source: Sun Microsystems Inc.
Rather than specifically define the internal storage cloud, industry analysts and consultants prefer to describe its attributes.
For example, the focus clearly is on low cost and easy scalability. "There's a big financial aspect to storage clouds," said Anand Prahlad, CommVault's vice president of product development. "Not only is it expected to be low cost, but you pay only for what you use." Simply put, internal storage clouds are expected to deliver cheap storage.
And not only cheap storage, but slow as well. Consultants like Schindler, however, don't rule out better storage performance or different service levels as part of the internal storage cloud.
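Prahlad's "pay only for what you use" point amounts to simple metering arithmetic. The sketch below is illustrative only: the rate, the department names and the function name are all hypothetical, since none of the people quoted here cite prices.

```python
# Hypothetical internal chargeback rate in cents; the article quotes no prices.
RATE_CENTS_PER_GB_MONTH = 10


def monthly_charges(usage_gb, rate_cents=RATE_CENTS_PER_GB_MONTH):
    """Return each consumer's charge in cents, based only on capacity actually used."""
    return {name: gb * rate_cents for name, gb in usage_gb.items()}


# Made-up usage figures (GB stored this month) for three departments.
usage = {"finance": 500, "engineering": 2000, "marketing": 0}
for dept, cents in monthly_charges(usage).items():
    print(f"{dept}: ${cents / 100:.2f}")
```

A department that stores nothing is billed nothing, which is the contrast with buying and provisioning an array up front.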
Manageability represents another distinguishing factor. "With an internal storage cloud you want to abstract away the complexity of the storage," said David Allen, chief technology officer (CTO) at i365, A Seagate Company. As a result, the private storage cloud should be easier to manage, enabling a single administrator to handle hundreds of nodes and petabytes of storage. However, the administrator's responsibilities may be limited to a handful of simple tasks.
Finally, how the private storage cloud is accessed can be a key distinguishing factor. HTTP will be the dominant access protocol. "All you want is HTTP or HTTPS connectivity and a Web browser," suggested Ken Satkunam, CTO at SentryBlue in Fargo, N.D.
"A big difference with the internal storage cloud is that it's accessible through an API, not a protocol," Nirvanix's Foskett said. "It will have a programmable API just like a website, maybe use REST over HTTP." Representational State Transfer (REST) is a stateless protocol that includes the state with every communication, the opposite of Fibre Channel (FC). REST provides access to Web services using HTTP; for storage clouds, REST would be used to access storage resources as services.
In a recent whitepaper, Sun Microsystems Inc. insists on this type of programmability in the storage cloud. "Instead of physically deploying servers, storage, and network resources to support applications, developers specify how the same virtual components are configured and interconnected, including how virtual machine images and application data are stored and retrieved from a storage cloud. They specify how and when components are deployed through an API."
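To picture what programmatic access over HTTP might look like, here is a minimal sketch of a REST-style storage request. The host name, URL layout and bearer-token header are assumptions made for illustration, not any vendor's actual API, and the request is only constructed, never sent.

```python
import urllib.request

# Hypothetical endpoint for an internal storage cloud; not a real service.
BASE_URL = "https://storage.example.internal/v1"


def build_put_request(container, name, data, token):
    """Build a self-contained HTTP PUT that stores one whole file.

    The request carries everything the server needs (target, credentials,
    payload), so the server keeps no per-session state between calls.
    """
    url = f"{BASE_URL}/{container}/{name}"
    headers = {
        "Authorization": f"Bearer {token}",  # assumed auth scheme
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=data, headers=headers, method="PUT")


req = build_put_request("archive", "q3-report.pdf", b"example bytes", token="demo-token")
print(req.get_method(), req.full_url)
```

Because each request stands alone, any node behind a load balancer could service it, which fits the scale-out designs described later.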
But the industry hasn't standardized on a cloud API, StorageIO Group's Schulz noted, and every cloud provider offers its own. In late July, however, Rackspace Hosting made the API specifications for its public Cloud Servers and Cloud Files open under the Creative Commons 3.0 Attribution license. This might eventually give would-be internal storage cloud builders an open API to get started.
One final characteristic -- multi-tenancy -- defines the public storage cloud. "Multi-tenancy is an important part of the storage cloud and even the internal storage cloud," CommVault's Prahlad said. With internal or private cloud storage, multi-tenancy would let the organization separate departments, projects and workgroups as needed.
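One way to picture multi-tenancy inside a company is a separate namespace per department or project. The following in-memory sketch is a stand-in, not a product design; the class and method names are hypothetical.

```python
class TenantStore:
    """In-memory stand-in for a storage cloud that isolates tenants by namespace."""

    def __init__(self):
        self._namespaces = {}  # tenant name -> {object name -> bytes}

    def put(self, tenant, name, data):
        self._namespaces.setdefault(tenant, {})[name] = data

    def get(self, tenant, name):
        # A tenant can only reach objects stored under its own namespace.
        try:
            return self._namespaces[tenant][name]
        except KeyError:
            raise KeyError(f"{name!r} not found for tenant {tenant!r}") from None

    def usage_bytes(self, tenant):
        # Per-tenant accounting falls out of the same separation.
        return sum(len(v) for v in self._namespaces.get(tenant, {}).values())


store = TenantStore()
store.put("finance", "ledger.csv", b"abc")
store.put("engineering", "ledger.csv", b"xyz12")
print(store.get("finance", "ledger.csv"), store.usage_bytes("engineering"))
```

The two departments use the same object name without colliding, and neither can read the other's copy.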
So what is an internal storage cloud? The consensus definition appears to be private storage capacity owned or at least controlled by the company, accessible programmatically over an HTTP connection and capable of delivering low cost, highly scalable storage with easily managed multi-tenancy. ParaScale Inc. adds that an internal storage cloud can be small (as few as three to five nodes), and still deliver the economies of cloud storage as well as the ease of management and scaling associated with the cloud.
Internal storage cloud options
If the internal storage cloud seems familiar, it is. "The storage grid has morphed into the private storage cloud," consultant Schindler said. Before the storage grid, utility computing packaged computing and storage resources as a metered service. Both concepts are similar, although the technology and architecture are different. "They were all about storage nirvana: accessing the data you want, where and when you want it, and at the cost you want," Schindler added, and without regard for what the actual storage device was or where it resided on the network.
The internal storage cloud is also similar to a network-attached storage (NAS) cluster, but with some caveats. "I'm not sure clustered NAS will scale to true storage cloud size," CommVault's Prahlad said. Although an internal storage cloud can start small, companies will want it to scale out by adding more devices.
When it comes to internal storage cloud products, the current choices are pretty thin or remarkably wide, depending on how you define the internal storage cloud. For actual products, EMC Corp. offers Atmos, which it describes as an offering for information storage and distribution. With Atmos, EMC stores and replicates a company's data through its global network depending on the service level you want. It uses business policies, policy-driven automation and metadata to manage a company's data in this vast storage cloud, and promises operational efficiency, reduced management complexity and cost savings.
AT&T is EMC's showcase customer for Atmos as a private storage cloud. But AT&T isn't really using it as a private cloud. Instead, it will offer services involving storage through Atmos to its own customers, which is more like a public cloud reseller.
Contrary to popular assumptions, there are no giant EMC storage arrays behind Atmos. "That would be way too expensive," Nirvanix's Foskett said. Instead, Atmos' scalable capacity is delivered as JBOD. With Atmos, you get what amounts to a box in your data center with an API and a NAS interface. Or you can use a chunk of the public Atmos storage cloud as a private cloud.
ParaScale offers software specifically for creating and managing an internal storage cloud. Unlike cloud service providers, it sells only the tools that let companies build their own storage clouds. Its software runs on standard x86-based Linux servers and aggregates the direct-attached disks on multiple servers into petabyte-scale file storage in a single namespace.
Beyond Atmos and ParaScale, commercial internal storage cloud products are pretty scarce. "After those, anyone that talks about private cloud isn't really a cloud," Foskett said. Rather, they probably offer storage products that incorporate virtualization at some level, which they're presenting as a cloud. "Often, they're offering their usual product and just sticking the 'cloud' term on it," he added. Similarly, almost any NAS cluster can be presented to look like an internal storage cloud.
Building internal storage clouds
"DIY is a big thing with internal clouds," consultant Schindler said. Do it yourself is popular because, as Illuminata's Webster noted, it simply isn't that difficult to assemble a private storage cloud (see "Essential internal storage cloud components," see below).
|Essential internal storage cloud components|
Here's a do-it-yourself parts list for building an internal storage cloud.
Source: Greg Schulz, StorageIO Group
There are many ways to design and build a private storage cloud. The simplest may be to "start with a NAS cluster, preferably with a global file system, and put on a cloud [Web] front end," i365's Allen said.
The actual storage behind the internal private cloud varies. You probably won't have a storage array as part of the private cloud. "Most will use commodity servers and fill the disk slots with low-cost drives," Nirvanix's Foskett said.
A variation: "Use racks or blade cabinets filled with Linux server blades and disk," SentryBlue's Satkunam said, adding that "the ability to use locally attached disk makes it much less expensive than SAN storage."
The key to building a scalable internal storage cloud "is to start with a lot of little boxes and scale out by adding more boxes," CommVault's Prahlad said. You get data protection through redundancy by replicating the data among the many nodes. To get quality of service, different nodes can have different service performance attributes.
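Replicating data among many small boxes requires some rule for deciding which nodes hold each copy. The sketch below uses rendezvous (highest-random-weight) hashing, which is one common choice and an assumption here; nobody quoted in this article names a placement scheme. Node names and the default of three copies are likewise illustrative.

```python
import hashlib


def place_replicas(object_name, nodes, copies=3):
    """Pick `copies` distinct nodes for an object using rendezvous hashing."""
    if copies > len(nodes):
        raise ValueError("not enough nodes for the requested number of copies")

    def score(node):
        # Each (node, object) pair gets a stable pseudo-random score.
        digest = hashlib.sha256(f"{node}:{object_name}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    # The highest-scoring nodes hold the copies.
    return sorted(nodes, key=score, reverse=True)[:copies]


nodes = ["node-a", "node-b", "node-c", "node-d", "node-e"]
print(place_replicas("media/video-001.mp4", nodes))
```

With this scheme, adding a box changes an object's placement only if the new box outscores one of its current holders, so scaling out does not reshuffle everything.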
The glue that ties it all together is "a global file system presenting a single name space," Prahlad noted. This may also entail a virtualization and metadata layer.
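The single-namespace idea can be reduced to a metadata catalog that maps one logical path to wherever the bytes physically live. This toy version is a sketch under that assumption; a real global file system does far more, and every name below is hypothetical.

```python
class GlobalNamespace:
    """Toy metadata layer: one logical path space over many storage nodes."""

    def __init__(self):
        self._catalog = {}  # logical path -> list of (node, local path)

    def register(self, logical_path, node, local_path):
        self._catalog.setdefault(logical_path, []).append((node, local_path))

    def resolve(self, logical_path):
        """Return every physical location holding this logical path."""
        return list(self._catalog.get(logical_path, []))

    def listdir(self, prefix):
        """List logical paths under a prefix, regardless of which node holds them."""
        prefix = prefix.rstrip("/") + "/"
        return sorted(p for p in self._catalog if p.startswith(prefix))


ns = GlobalNamespace()
ns.register("/projects/alpha/spec.doc", "node-a", "/disk2/0001")
ns.register("/projects/alpha/spec.doc", "node-c", "/disk1/0417")
ns.register("/projects/beta/plan.xls", "node-b", "/disk3/0099")
print(ns.listdir("/projects"))
```

Users and applications see only the logical paths; which node and disk hold each copy stays in the catalog.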
Management of the internal storage cloud should be simple. "You have to look at websites like Amazon and Facebook for your model. You want whole file storage over HTTP," Ocarina Networks' George explained. For simplicity, limit your file management options to create, read, update, delete and move/copy.
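That short operation list maps naturally onto HTTP methods. The sketch below models the idea in memory: POST, GET, PUT and DELETE are standard HTTP methods, COPY and MOVE are borrowed from WebDAV, and the mapping itself, the status codes chosen and the class name are assumptions for illustration rather than anything George prescribed.

```python
class WholeFileStore:
    """In-memory model of whole-file storage with a deliberately small verb set."""

    def __init__(self):
        self._files = {}  # path -> bytes

    def handle(self, method, path, body=None, destination=None):
        """Dispatch one request and return (status_code, payload)."""
        if method == "POST":  # create
            if path in self._files:
                return 409, b""
            self._files[path] = body or b""
            return 201, b""
        if path not in self._files:
            return 404, b""
        if method == "GET":  # read the whole file
            return 200, self._files[path]
        if method == "PUT":  # update by replacing the whole file
            self._files[path] = body or b""
            return 204, b""
        if method == "DELETE":
            del self._files[path]
            return 204, b""
        if method in ("COPY", "MOVE"):
            if destination is None or destination == path:
                return 400, b""
            self._files[destination] = self._files[path]
            if method == "MOVE":
                del self._files[path]
            return 201, b""
        return 405, b""  # anything else is not supported


store = WholeFileStore()
print(store.handle("POST", "/mail/2009/aug.mbox", b"archive"))
print(store.handle("MOVE", "/mail/2009/aug.mbox", destination="/archive/aug.mbox"))
print(store.handle("GET", "/archive/aug.mbox"))
```

There are no partial writes, locks or byte-range updates in this model, which is the kind of simplification that keeps management overhead low.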
The internal storage cloud doesn't replace an organization's tier 1 storage. Production data continues to run on the high-performance FC SAN or primary iSCSI SAN where it's backed up and protected. Instead, the internal cloud would be used for all the file-based data eating up primary disk space and complicating backup and recovery strategies, as well as for email, archival, media and compliance data. That data is still active, widely used and changed; it needs to be stored and shared but without the expense, performance and service levels associated with tier 1 production storage.
The latest Wave study (January 2009 to May 2009) from New York City-based TheInfoPro asked about interest in clouds in general. "The interest level was light, maybe 12% to 15%," reported Robert Stevenson, TheInfoPro's managing director of storage research. "Most [respondents] had no plans for the cloud." Large companies apparently aren't clamoring for internal storage clouds or cloud computing at this point.
They may, however, already be mimicking internal storage clouds but not realize it as they pop virtualized servers with attached disk onto the network. It's a small step from that to an actual internal storage cloud.
BIO: Alan Radding is a frequent contributor to Storage magazine.