Object storage vendors turn to analytics, AI, machine learning
The view of object stores as nothing more than cheap and deep storage is changing, as the technology finds its way into AI, machine learning and analytics use cases.
Nvidia's recent acquisition of SwiftStack to bolster its artificial intelligence stack underscored the ways that object storage is expanding beyond backing up and archiving cold data.
High-performance object stores are taking on artificial intelligence (AI), machine learning, analytics and container-based workloads, as enterprises try to gain insight into their unstructured data. Some use fast flash storage to accelerate small-file throughput. SwiftStack claimed GPU servers working in parallel could access petabytes of data stored on spinning disk-based object clusters at a rate of more than 100 GBps.
The fresh use cases stand in contrast to the traditional view of object stores as cheap-and-deep repositories for cold or cool data that IT organizations want to move off faster, more expensive storage tiers. In that role, object stores could scale out on commodity server hardware to keep up with rapid unstructured data growth.
Amita Potnis, a research director at IDC's enterprise infrastructure practice, predicted that backup and archive would continue to be the "bread-and-butter" use case for object storage for a long time. But she noted that online surveys and phone interviews with cloud providers and enterprises have shown demand and a gradual ramp in adoption of object storage for purposes such as AI and big data analytics.
Ramping up for AI, analytics
Potnis said more vendors have been turning their focus to those use cases during the past 12 to 18 months. She said newer players such as MinIO and OpenIO are also targeting the big data analytics and AI space.
"It's slow and steady moving in that direction," Potnis said. "These are workloads where the amount of data generated and stored is extremely high, and the use of object storage is more viable because of its proven scale and economics. What people are working on now is performance. That was the part that was lacking."
Enrico Signoretti, a research analyst at GigaOm, said many established vendors would not be able to grow or compete without making radical changes to their object storage. Signoretti predicted a string of announcements focusing on new use cases and at least one more object storage acquisition before the end of 2020.
Nvidia's acquisition of SwiftStack was the second object storage acquisition of the year. In February, Quantum bought Western Digital's ActiveScale, after selling the product for years, to bring an archive tiering option into the fold for its higher-performing StorNext file storage.
Signoretti said object stores that integrate with or target other products and applications in a vendor's portfolio could be well-positioned. He pointed to examples such as Hitachi Content Platform (HCP), which integrates with the vendor's analytics stack, NetApp's StorageGrid, and Red Hat Ceph, the storage of choice for the vendor's OpenShift container platform.
But Signoretti said object storage specialists that focus merely on Amazon S3 API compatibility and a basic feature set could fall into oblivion because customers have plentiful options from storage vendors and cloud providers.
Tough to compete with AWS, Google, Microsoft
"Amazon, Google, Microsoft are still fighting each other on a dollar-per-gigabyte basis. So, it becomes really, really tough to compete with these guys," Signoretti said.
Chris Evans, who runs storage consultancy Brookend Ltd., said the race to the bottom on price and the emergence of effectively free open source options, such as MinIO, have spurred existing players to move past simple object storage. He noted that Scality added the SOFS scale-out file system and Zenko multi-cloud orchestration, and Cloudian tacked on file services and an Edgematrix subsidiary focused on AI processing.
Evans said new entrants Vast Data and Stellus, which offer S3 connectivity, refer to their products as "data platforms" rather than object storage. Pure Storage does the same with FlashBlade, he said.
"I suggest that the term 'object store' is becoming tainted, and as a result, vendors are looking for value-add to be able to charge more," Evans said. "Being a 'data platform' sounds better. If you're just an object store vendor, I don't think there's a big future ahead, because the value is in what's done with the data, not how it's stored."
Marc Staimer, founder and president of Dragon Slayer Consulting, said stand-alone object storage companies won't all go out of business, but some will get acquired and most may not be around for the long haul. He said the future is data management, and storage is a commodity that is simply fast or slow and expensive or cheap.
"Storage has always been the tail that wags the IT budget, because you've got to keep storing the data. But that's changing, too," Staimer said. "IT is now looking at it and saying, 'I don't want to have to keep buying storage for all the stuff I've stored in the past plus all the stuff I'm going to store. I want to be able to manage the data so that I'm not going to keep it on expensive storage. I'll keep it somewhere else -- on tape, in the cloud, on the object store, wherever it makes the most sense.'"
Staimer said object storage is growing mainly in the cloud, and its chief selling points remain high scalability and good throughput at low cost. He said he does not see lots of end users moving to object stores for AI and ML.
But Cloudian CTO Gary Ogasawara cited Nvidia's acquisition of SwiftStack as evidence that object storage is a "valued" technology for AI and machine learning workloads. He said one government customer is streaming data from sensors to forecast the weather, and he sees use cases with autonomous cars.
Ogasawara said Cloudian is devoting considerable attention to its HyperStore Analytics Platform (HAP). HAP packages AI and machine learning software on the same hardware as its HyperStore object storage for customers who want to use frameworks such as Apache Spark or TensorFlow with their data. Future directions for Cloudian include developing new data APIs to support AI and machine learning and offering all-flash appliances for latency-sensitive applications, Ogasawara said.
"Where I see the next battlefield is in more advanced APIs and being able to take advantage of semi-structured then structured data. That's things like adding SQL type query functionality, adding the ability to use it really easily for AI and machine learning workloads," Ogasawara said. "It's really how do we make object storage smarter."