Two new Google Cloud Platform storage services provide increased speed and data access for file systems that underpin AI and machine learning applications, a major focus of next week's Google Cloud Next conference.
Google Storage Fuse, available now, enables applications to access Google Cloud object storage buckets as files within a customer's file system, removing the need to refactor application code.
Parallelstore, available in private preview, provides a managed parallel file system for customers that want to minimize storage IOPS bottlenecks when developing machine learning (ML) applications or running high-performance computing workloads.
The hyperscaler will also sell Google Cloud NetApp Volumes as a managed service for traditional enterprise file storage workloads. The service is available now and supported by Google. It replaces a managed file service on Google Cloud that was supported by NetApp.
Google Storage Fuse and Parallelstore are targeted at developers, Google's primary audience, said Steve McDowell, an analyst and founding partner at NAND Research.
"Google's always been for the technical audience," he said. "[That audience is] doing the AI and analytics."
Fusing storage in parallel with AI
Google Storage Fuse builds on the open source Filesystem in Userspace (Fuse) project, which allows developers to create file systems within Linux for applications. Google Storage Fuse, available as a managed or nonmanaged service, runs as an application within the customer's Linux environment, with Google Kubernetes Engine integration available. It supports the machine learning frameworks PyTorch and TensorFlow.
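To illustrate what skipping a refactor could look like in practice, here is a minimal sketch, not Google's implementation. It assumes a bucket has already been mounted at some directory; the function names, the `.txt` suffix and the mount path mentioned afterward are hypothetical. The code uses only ordinary file calls from Python's standard library, so it runs unchanged against a local directory or a Fuse mount point.

```python
from pathlib import Path


def iter_training_files(root, suffix=".txt"):
    """Yield (relative_path, contents) for each matching file under root.

    root can be any directory. With a Fuse-mounted bucket (assumed here),
    objects appear as regular files, so no storage SDK calls are needed.
    """
    root = Path(root)
    for path in sorted(root.rglob(f"*{suffix}")):
        if path.is_file():
            yield path.relative_to(root).as_posix(), path.read_bytes()


def total_bytes(root, suffix=".txt"):
    """Sum the sizes of all matching files by reading them through the file API."""
    return sum(len(data) for _, data in iter_training_files(root, suffix))
```

An application would call `total_bytes("/mnt/training-bucket")`, where `/mnt/training-bucket` is a made-up mount path, exactly as it would for a local folder. Read latency and throughput on a mounted bucket will differ from local disk, so treat this as a sketch of the access pattern only.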
Accessing cloud object storage through a file system could help AI and ML developers save money on development by using less costly object storage instead of file storage, according to Sean Derrington, group product manager of storage at Google Cloud. If access speed isn't a priority, developers could also use less expensive colder storage tiers.
Other cloud storage vendors such as Nasuni have provided features that let object storage act as localized file storage, said Naveen Chhabra, an analyst at Forrester Research. Providing cloud object storage as if it were available locally for applications seeking file systems is a useful new capability other hyperscalers haven't built yet, he said.
"If this can provide accessibility as if the storage was local, [that] would be a game changer," he said. "You don't hear many hyperscalers talking about that performance."
Parallelstore brings enterprise capabilities and managed features to open source Distributed Asynchronous Object Storage (DAOS), which creates a parallel file system on commodity hardware enabled by NVMe storage. Intel previously used DAOS for its now discontinued Optane NVMe storage products.
The parallel file system, which aims for high IOPS and fast read speeds, isn't likely to be used for many traditional enterprise workloads, but will instead target customers looking to keep GPU utilization high during data processing. Google claims the service will outperform Lustre, another parallel distributed file system available in competing clouds such as AWS.
Parallel file systems can work for ML applications that require fast reads of data, Chhabra said, but use cases such as these are still a small and emerging market, hence the private preview release.
"This is more of an optimization architecture," he said.
Google Cloud NetApp Volumes completes a triumvirate for NetApp, making it the only storage provider with a file storage service managed by each of Google, AWS and Microsoft Azure.
NetApp's file storage management tools and connections for OnTap hybrid cloud functionality are now offered as a service managed by Google Cloud. Google's management of the service will tie billing, auditing and other enterprise requirements to a customer's Google account rather than separately with NetApp, according to Google's Derrington. Existing NetApp Cloud Volume Service customers will have the option to transition to NetApp Volumes.
While hyperscalers offer storage services and capabilities, few can provide a comprehensive offering like NetApp's, which includes features such as snapshots and deduplication, McDowell said. The interoperability of NetApp in the cloud and on premises also makes hybrid cloud workloads -- which remain among the most popular setups for storage -- easier to manage.
"I am honestly shocked [that] 15 years into this cloud experiment, none of [the hyperscalers] have been doing proper storage," McDowell said. "NetApp said, 'There's a market here, and I'm going to drive it.'"
Tim McCarthy is a journalist from the Merrimack Valley of Massachusetts. He covers cloud and data storage news.