Startup Hammerspace is taking aim at data silos with its new SaaS product that claims to give companies access to their unstructured data whenever and wherever they need it across on-premises and public cloud sites.
As Hammerspace CEO David Flynn likes to say, "We think data should be like the air you breathe."
Flynn is no stranger to newfangled technology ideas. He co-founded PCI Express flash pioneer Fusion-io -- acquired by SanDisk in 2014 -- based on the premise that data should be close to the processor. He moved on to start the data virtualization startup Primary Data with an executive team that included Fusion-io veterans Lance Smith and Rick White. Apple co-founder Steve Wozniak served as chief scientist of Fusion-io and Primary Data.
Primary Data suspended operations in January when its financial backers pulled the plug. But angel investors supplied a lifeline, enabling Primary Data's technology to live on in the new Hammerspace data-as-a-service product. Hammerspace's 25-person staff features Primary Data's most talented developers, including CTO Trond Myklebust, the chief Linux kernel maintainer for NFS, according to Flynn.
Shift to cloud focus
Flynn now thinks Primary Data was "somewhat doomed to fail" with its focus on a single data center, traditional IT and infrastructure operators. In contrast, Hammerspace can run in the cloud and provide a control plane for file- and object-hosted data spanning on-premises and public cloud environments.
"The Primary Data technology worked fine. It's just nobody was interested in that," Flynn said. "They were interested in moving to hybrid cloud."
Like Primary Data, Hammerspace separates the control path from the data path to enable the software to manage files and objects independent of the underlying storage infrastructure. A metadata engine points to storage systems and extracts information about the data. When the user needs to access data, the metadata service supplies its physical location.
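The split between control path and data path can be illustrated with a small Python sketch. It is not Hammerspace's implementation. The class names, store names and paths are hypothetical, and in-memory dictionaries stand in for real storage systems. The point it shows is that a client asks a metadata service where a file lives and then reads from that store directly, so moving the bytes only requires updating the metadata entry.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataService:
    """Control path (hypothetical): maps a logical path to a physical location."""
    locations: dict = field(default_factory=dict)

    def register(self, logical_path, store_name, physical_key):
        # Record (or update) which store holds the data and under which key.
        self.locations[logical_path] = (store_name, physical_key)

    def locate(self, logical_path):
        return self.locations[logical_path]


class Client:
    """Data path (hypothetical): looks up the location, then reads the store directly."""

    def __init__(self, metadata, stores):
        self.metadata = metadata
        self.stores = stores

    def read(self, logical_path):
        store_name, key = self.metadata.locate(logical_path)
        return self.stores[store_name][key]


# Plain dictionaries stand in for an on-premises NAS and a cloud object store.
stores = {
    "onprem-nas": {"vol1/report.txt": b"quarterly numbers"},
    "cloud-object": {},
}
metadata = MetadataService()
metadata.register("/projects/report.txt", "onprem-nas", "vol1/report.txt")
client = Client(metadata, stores)
data_before = client.read("/projects/report.txt")

# Move the bytes to the other store; the client's logical path does not change.
stores["cloud-object"]["bucket/report.txt"] = stores["onprem-nas"].pop("vol1/report.txt")
metadata.register("/projects/report.txt", "cloud-object", "bucket/report.txt")
data_after = client.read("/projects/report.txt")
```

In this toy version, the application keeps using `/projects/report.txt` before and after the move, which is the behavior a global namespace is meant to provide.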
Hammerspace is blazing new trails in the way it collects and uses metadata. Flynn said the product enables a "quantum leap of sophistication" over what traditional POSIX file systems and object stores can do.
Hammerspace's metadata goes beyond basic information such as a file's creation date or the last time it changed. It can also carry attributes, tags, labels and keywords that help users identify and find the information they need faster without having to open files. Users can also define and program the metadata.
Hammerspace stores the metadata in its own database on local flash block storage. The metadata database is replicated synchronously for high availability within a single data center, and relevant metadata is replicated asynchronously across data centers, according to Douglas Fallstrom, vice president of products and operations at Hammerspace.
The metadata can be ad hoc or structured, Fallstrom noted. Users have the ability to define and program it to create a feedback control loop and enable the system to react to metadata changes by automatically moving data.
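A feedback loop of the kind Fallstrom describes might look something like the following Python sketch. It is an illustration under stated assumptions, not the product's API. The `Catalog` class, the `status` tag, the tier names and the rule function are all hypothetical, and changing a placement label stands in for an actual data move.

```python
class Catalog:
    """Hypothetical metadata catalog that reacts to tag changes."""

    def __init__(self):
        self.tags = {}       # path -> dict of user-defined tags
        self.placement = {}  # path -> name of the tier holding the data
        self.rules = []      # callables: (path, tags) -> tier name or None

    def add_rule(self, rule):
        self.rules.append(rule)

    def add_file(self, path, tier):
        self.tags[path] = {}
        self.placement[path] = tier

    def set_tag(self, path, key, value):
        # Every metadata change triggers a re-evaluation of the rules.
        self.tags[path][key] = value
        self._react(path)

    def _react(self, path):
        for rule in self.rules:
            target = rule(path, self.tags[path])
            if target is not None and target != self.placement[path]:
                self.placement[path] = target  # stands in for a real data move
                break

    def find(self, key, value):
        # Search by tag without opening any file.
        return sorted(p for p, t in self.tags.items() if t.get(key) == value)


def archive_finished_projects(path, tags):
    """Hypothetical rule: completed projects belong on the archive tier."""
    return "cloud-archive" if tags.get("status") == "complete" else None


catalog = Catalog()
catalog.add_rule(archive_finished_projects)
catalog.add_file("/media/film-a/raw.mov", "onprem-nas")
catalog.add_file("/media/film-b/raw.mov", "onprem-nas")

catalog.set_tag("/media/film-a/raw.mov", "status", "in-progress")
catalog.set_tag("/media/film-b/raw.mov", "status", "complete")
completed = catalog.find("status", "complete")
```

Here, tagging the second file as complete is the only action the user takes. The rule reacts to that change and updates the placement, while the first file stays where it is.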
Fallstrom said customers can determine exactly where they want to store instances of the actual data or they can let the Hammerspace data control plane make those decisions. Hammerspace uses telemetry, machine learning and continuous optimization to decide which storage best meets the performance, protection and cost objectives.
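To make the idea of objective-based placement concrete, here is a deliberately simple Python sketch. It does not use telemetry or machine learning, and it is not how Hammerspace makes its decisions. It just filters candidate tiers by latency and copy-count objectives and picks the cheapest one that qualifies. The tier names and every number in it are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    """Hypothetical storage tier with made-up characteristics."""
    name: str
    latency_ms: float   # typical access latency
    copies: int         # number of protected copies kept
    cost_per_tb: float  # relative monthly cost


def choose_tier(tiers, max_latency_ms, min_copies):
    """Return the cheapest tier meeting both objectives, or None if none does."""
    candidates = [
        t for t in tiers
        if t.latency_ms <= max_latency_ms and t.copies >= min_copies
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda t: t.cost_per_tb)


tiers = [
    Tier("local-flash", latency_ms=1.0, copies=2, cost_per_tb=100.0),
    Tier("onprem-nas", latency_ms=10.0, copies=2, cost_per_tb=40.0),
    Tier("cloud-object", latency_ms=100.0, copies=3, cost_per_tb=20.0),
]

hot = choose_tier(tiers, max_latency_ms=5.0, min_copies=2)
warm = choose_tier(tiers, max_latency_ms=50.0, min_copies=2)
cold = choose_tier(tiers, max_latency_ms=500.0, min_copies=3)
impossible = choose_tier(tiers, max_latency_ms=0.5, min_copies=2)
```

A customer who wants full control would skip a function like `choose_tier` and name the tier directly. The automated path only differs in who makes the choice.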
The product supports WAN-optimized, site-to-site data transfers and globally deduplicates and compresses data. Hammerspace can also move data to the cloud and currently supports tiering to AWS Simple Storage Service (S3). Support for Microsoft Azure and Google will follow, according to Fallstrom.
Siloed data is the enemy
"Our No. 1 enemy is data being siloed into separate storage," Flynn said. "That siloing restricts access and hinders performance. The copying of data between silos only makes the problem worse. You end up making more data in more places and increasing the challenge of data management."
Hammerspace abstracts and hides the multiple copies of data from the user, providing access to the data through a global namespace that can span on-premises and public cloud sites. Hammerspace currently supports up to four sites, but Fallstrom said the company could support more in the future; the design limit is 16.
The initial Hammerspace product enables data access only through NFS and SMB file protocols. Hammerspace recently added beta-level support for object access through the Amazon S3 API. Fallstrom expects the beta period to run for about eight weeks.
Hammerspace is designed to run on physical or virtual servers located at the customer site and in public clouds, such as AWS, Azure and Google. Flynn said Hammerspace plans to add support for container deployments in future releases.
"We are not storage. We're the thing that enables data to flow freely across the environment," Flynn said. "This works with your existing infrastructure and can even leave it unchanged and replicate from it without disturbing the current running environment."
Primary Data users had to endure a short application outage to unmount and remount the storage. But Hammerspace can non-disruptively start adding metadata to files without having to shut down applications, workflows or systems, according to Fallstrom.
Hammerspace targets enterprises in data-heavy industries such as media and entertainment, oil and gas, bioinformatics, and life sciences. The service also takes aim at artificial intelligence, big data and internet of things workloads that produce massive amounts of data.
Howard Marks, founder and chief scientist at DeepStorage, said Hammerspace's technology looks promising with its Swiss-army-knife approach to solving complex problems. He expects the Silicon Valley startup will fare better than Primary Data did.
"Primary Data addressed problems, but those problems could also be addressed by throwing money at them. If you had enough money to throw at them, you weren't looking for a solution. Therefore, you never found Primary Data," Marks said. "There are no good solutions to the problems that Hammerspace is addressing. The hybrid cloud file system is a much better market.
"The problem with hybrid cloud is data gravity and a file system that automatically replicates data and lets you address the same data from AMIs [Amazon Machine Images] when you're doing your analytics and from the sensors on the factory floor into your on-premises data center," Marks continued. "How else are you going to get the data from one place to another? You could write a whole lot of scripts and replicate and copy."
George Crump, founder and president of Storage Switzerland, said the ability to add custom tags at the metadata level to perform data cataloging, indexing and searching will become increasingly critical for customers facing compliance, security and data privacy challenges as their unstructured data grows exponentially.
"I like the concept of the product. There's a definite need for it," Crump said. "Their challenge is going to be to make IT realize that the time is now to address the problem."
Crump said Hammerspace's competition could include vendors such as Elastifile, Scality, SoftNAS and SwiftStack, although those vendors don't offer all of the analytics, machine learning and tagging capabilities that Hammerspace does.
Hammerspace has a pay-as-you-go, consumption-based pricing model that starts at $1,000 for up to 10 TB of data under management. Because Hammerspace automatically deduplicates and compresses data that goes into object storage, the 10 TB potentially represents far more data, Fallstrom noted.
Hammerspace sells no hardware, but Fallstrom said the company makes recommendations for server and storage configurations and works with partners on complete solutions. Hammerspace's partners include NetApp, Western Digital, Red Hat and Cloudian.