7 hyper-converged secondary storage questions answered
Combining HCI and the use cases of secondary storage, such as data recovery and archiving, was a challenge, but a few companies took up that task and launched a new market.
The early use cases for hyper-converged infrastructure centered on functions that rely on primary storage, such as virtual desktops and the applications that they run. By the mid-2010s, a couple of startups recognized a market for hyper-converged secondary storage and produced HCI appliances specifically designed to meet the needs of applications such as backup, archiving and data recovery. To achieve this, the startups -- Cohesity and Rubrik -- included software that integrates the various secondary storage functions and manages them all via a single interface. As a result, both the hardware and the software are built on the principles of convergence.
You could argue that hyper-converged secondary storage appliances are even better tuned to take advantage of the benefits of hyper-convergence. Because the physical components are suited specifically for storage, users are less likely to overprovision compute capacity when they add nodes to expand storage capacity. A standard HCI product is designed so that each node contributes both compute and storage to a shared resource pool, so any node added to the cluster increases both resources. With a purpose-built, HCI-based secondary storage appliance, more storage is what you need most, and more storage is what you get most.
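The overprovisioning argument can be made concrete with a little arithmetic. The sketch below is illustrative only: the node profiles, the 480 TB target and the function names are hypothetical assumptions, not specifications from any vendor named in this article. It compares how many CPU cores come along for the ride when a cluster is grown to a storage target using general-purpose nodes versus storage-dense nodes.

```python
import math


def cluster_totals(target_tb, tb_per_node, cores_per_node):
    """Size a cluster by storage need and report what comes with it.

    All inputs are hypothetical figures chosen for illustration.
    """
    # Nodes are added whole, so round up to meet the storage target.
    nodes = math.ceil(target_tb / tb_per_node)
    return {
        "nodes": nodes,
        "storage_tb": nodes * tb_per_node,
        "cores": nodes * cores_per_node,
    }


# Assumed profile of a general-purpose HCI node: balanced compute and storage.
general = cluster_totals(480, tb_per_node=20, cores_per_node=32)

# Assumed profile of a storage-dense secondary storage node.
dense = cluster_totals(480, tb_per_node=60, cores_per_node=16)

print("general-purpose:", general)
print("storage-dense:  ", dense)
```

Under these assumed figures, both clusters reach the same capacity, but the general-purpose cluster needs three times as many nodes and ends up with six times as many cores, most of which a backup or archive workload would likely leave idle.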
What separates hyper-converged secondary storage from standard HCI?
According to analyst firm Taneja Group, secondary workloads, such as disaster recovery, archiving, data protection and the like, must be added to an HCI system for it to qualify as hyper-converged secondary storage. Often, these systems will also include data deduplication, compression and encryption, but those features are also found in HCI used for other storage purposes.
Aside from the various forms of data protection and archiving, what is the most common application in secondary storage on HCI?
The answer is data analytics, which taps into the information stored on lower tiers to help an organization discover hidden value. Since as much as 90% of an organization's data is in secondary storage, there is much potential value to be uncovered.
What data sources can an HCI-based secondary storage system draw from?
The two early players in the market, Cohesity and Rubrik, focused on virtual machines at the start, with support for the cloud. But in the past two years, each company has added support for physical servers to its hyper-converged secondary storage product -- DataPlatform for Cohesity and Cloud Data Management for Rubrik.
What are other hyper-converged secondary storage vendors in addition to Cohesity and Rubrik?
Two companies associated mostly with backup software have also entered the hyper-converged secondary storage market. Commvault announced its product, dubbed HyperScale, in 2017. Two years prior to Commvault's entry, Asigra launched its Converged Data Protection Appliance to get into the growing market.
How do the vendors handle the data files to be stored?
Each of the four vendors mentioned so far uses a scalable distributed file system to manage the data it places in secondary storage. Cohesity and Rubrik have proprietary file systems -- SpanFS for Cohesity and Cloud-Scale File System for Rubrik. Asigra uses the open source Z File System, or ZFS. Commvault has teamed up with Red Hat to use the GlusterFS file system for its hyper-converged secondary storage appliance.
Which data storage types do HCI secondary storage systems support?
Even though most of the vendors in the market started out focusing on data protection, most of them now support file, block and object storage. This enables them to support all use cases of secondary storage and all possible storage locations, including the cloud.
Do all hyper-converged secondary storage systems include backup and archiving functions?
No. ExaGrid Systems sells a product that it labels as HCI-based secondary storage, but it relies on partners such as Veeam to provide the backup technology. The system is a data repository or, as ExaGrid calls it, a data "landing zone." Even so, it is a hyper-converged storage system and is marketed specifically for use in secondary storage applications.