The glut of generative AI products at AWS can only work if customers are sitting on a trove of usable data, leading the public cloud provider to introduce new storage services it sees as vital to AI training.
At AWS re:Invent 2023 this week, AWS expanded its cloud storage offerings with almost a dozen new object, file and block services.
File storage in the AWS cloud gained the majority of these services, including additions to the AWS-native version of NetApp's ONTAP data platform. A few colder storage tiers for block and file rounded out the additions, further signaling the vendor's big data play, according to industry analysts.
But it was the additions to its object storage service that took center stage, with AWS CEO Adam Selipsky introducing a new Amazon S3 object storage tier during his opening keynote Tuesday. The new addition brings a higher-performance tier to the traditionally slower storage format.
Selipsky joked that storage "surely can't be an area that's ready for reinvention," but he went on to state that generative AI will need performance gains at every part of the tech stack.
Overall, the additions are incremental improvements to the AWS storage portfolio, said Dave Raffo, an analyst at Futurum Group. Compared with the vendor's headliner products such as Amazon Q, a generative AI assistant and a response to Microsoft's Copilot, these storage offerings aren't going to make any IT administrators transition to AWS.
Instead, the additions are meant to keep existing customers in AWS and help them further grow their data footprint. Supporting massive amounts of customer data to feed AI development is the current marketing play for storage vendors, Raffo said, and that message could once again change in several months.
"The announcements look incremental and are adding features to those [storage services]," he said.
Amazon's speedy storage service
Selipsky personally introduced Amazon S3 Express One Zone during his opening keynote.
Selipsky and AWS materials said the service will provide "single-digit millisecond data access" for customer applications using object storage but will be limited to operating in a single availability zone. The service mostly benefits workloads that process large numbers of small objects, according to the vendor, which can reduce runtime costs for other compute services.
AWS said use cases include AI/machine learning training, media processing and high-performance computing. The service starts at $0.16 per GB per month, compared with $0.023 or $0.022 per GB for S3 Standard tiering.
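As a rough illustration of the trade-off, the sketch below compares storage-only monthly costs between the two tiers using AWS's launch list prices ($0.16 per GB-month for Express One Zone, $0.023 for the first tier of S3 Standard). The dataset size is hypothetical, and real bills also depend on region, request counts and data transfer, none of which are modeled here.

```python
# Rough monthly storage-cost comparison between S3 Express One Zone and
# S3 Standard, using launch list prices. Request and transfer charges,
# which differ between the tiers, are ignored.

EXPRESS_ONE_ZONE_PER_GB = 0.16  # USD per GB-month (launch list price)
S3_STANDARD_PER_GB = 0.023      # USD per GB-month (first 50 TB tier)

def monthly_cost(gb: float, price_per_gb: float) -> float:
    """Storage-only monthly cost in USD for a dataset of `gb` gigabytes."""
    return gb * price_per_gb

dataset_gb = 10_000  # hypothetical 10 TB training set
print(f"Express One Zone: ${monthly_cost(dataset_gb, EXPRESS_ONE_ZONE_PER_GB):,.2f}")
print(f"S3 Standard:      ${monthly_cost(dataset_gb, S3_STANDARD_PER_GB):,.2f}")
```

The point of the comparison is that the performance tier carries a substantial per-GB premium, which is why AWS positions it for hot, small-object workloads rather than bulk retention.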
Analyst briefings on the service indicated that Express One Zone uses flash storage, rather than the hard drives traditionally used by object storage services, Raffo said.
Amazon S3, a common standard of object storage across cloud services, wasn't used with databases or other applications demanding high performance in the past due to its slower access speeds, said Scott Sinclair, an analyst at TechTarget's Enterprise Strategy Group.
Object storage was considered useful for less frequently accessed data or repository storage. Now, through the advent of cheaper flash memory, enterprises are seeing object storage as a way to maximize available storage and collect massive data sets for AI/ML development, Sinclair said.
Many cloud customers have already built up significant reserves of data ripe for AI training and are unlikely to pay the costs of moving it out, he said.
"Data has gravity," Sinclair said. "Once you get 2 to 4 petabytes in the clouds, it's not going anywhere."
Effective AI pipelines, at least those that need to maintain speed for the business, are now moving toward all-flash storage, he added.
"People need flash all throughout their data pipelines," Sinclair said. "It isn't small, fast storage or big, slow storage [anymore]. Bottom line: Everything needs to be fast."
Increased speed for cloud storage still isn't the cutting edge of performance AWS may tout it as, said Marc Staimer, president of Dragon Slayer Consulting.
Object storage isn't the best fit for databases or high-performance computing, which use block storage, and on-premises or colocated services can provide sub-millisecond access speeds, he said. Still, faster object storage can benefit generative AI development, such as by expediting retrieval-augmented generation, a technique that grounds model responses in retrieved data.
"This is not a wow," Staimer said. "I'd call it high-performance object storage. If you really want performance, you're not using S3. You'll be using different storage."
Expansion on file
About half of the storage products and capabilities debuting at re:Invent were related to file storage services offered by AWS, including Amazon Elastic File System (EFS) and Amazon FSx for OpenZFS.
Amazon EFS now offers replication failback, enabling replication of changes from backup to primary file systems, and an increase in total IOPS for the service. Amazon FSx for OpenZFS can now send snapshots from one file storage system in a customer's AWS account to another.
File storage customers in AWS have a new option for colder storage with the Amazon EFS Archive tier. The service targets data that customers expect to access about once a quarter and is priced at $0.008 per GB per month, compared with the EFS Standard tier at $0.30 or EFS Infrequent Access at $0.016.
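The tier gap compounds over time. A minimal sketch, using the per-GB-month EFS tier prices cited above (and ignoring the access charges that the colder tiers add per retrieval), compares the annual storage-only cost of parking 1 TiB in each tier:

```python
# Annual storage-only cost for 1 TiB (1,024 GB) of file data in each
# Amazon EFS tier, using the per-GB-month prices cited above. Per-access
# retrieval charges on the colder tiers are not modeled.

EFS_TIER_PRICES = {          # USD per GB-month
    "Standard": 0.30,
    "Infrequent Access": 0.016,
    "Archive": 0.008,
}

def annual_cost(gb: float, price_per_gb_month: float) -> float:
    """Storage-only cost in USD for one year."""
    return gb * price_per_gb_month * 12

for tier, price in EFS_TIER_PRICES.items():
    print(f"{tier}: ${annual_cost(1024, price):,.2f} per year")
```

For rarely touched data, the Archive tier works out to well under 3% of the Standard tier's storage cost, which is the economics behind a "once a quarter" access target.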
For Amazon Elastic Block Store (EBS) customers, the AWS Backup service adds snapshot management capabilities with Snapshots Archive. EBS snapshots with retention periods of at least 90 days can be set to move automatically into the colder Snapshots Archive tier as part of the snapshot's lifecycle.
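The 90-day floor can be expressed as a small validation sketch. The `MoveToColdStorageAfterDays` and `DeleteAfterDays` field names mirror the lifecycle block of an AWS Backup plan rule, but `validate_lifecycle` itself is a hypothetical helper for illustration, not part of any AWS SDK:

```python
# Sketch of the lifecycle settings that move an EBS snapshot into the
# Snapshots Archive tier. Field names mirror the Lifecycle block of an
# AWS Backup plan rule; validate_lifecycle is an illustrative helper,
# not an AWS SDK function.

def validate_lifecycle(lifecycle: dict) -> bool:
    """Check the archive rule: archived snapshots need >= 90 days of cold retention."""
    move = lifecycle.get("MoveToColdStorageAfterDays")
    delete = lifecycle.get("DeleteAfterDays")
    if move is None:
        return True  # no archiving requested; nothing to enforce
    # Cold storage imposes a 90-day minimum retention before deletion.
    return delete is None or delete - move >= 90

rule = {"MoveToColdStorageAfterDays": 30, "DeleteAfterDays": 365}
print(validate_lifecycle(rule))  # archived after a month, deleted after a year
```

A rule that tried to delete the snapshot 60 days after archiving it would fail this check, matching the minimum-retention constraint the article describes.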
The AWS Backup service now also offers automated restore testing and validation, enabling customers to regularly exercise their recovery process ahead of a disaster.
File storage with ONTAP
Amazon FSx for NetApp ONTAP, an AWS-native file service built on NetApp's technology, also gained new capabilities, including a scale-out file system option, support for multiple availability zones in virtual private clouds, and management of FlexGroup volumes through the AWS console.
The new scale-out option increases the total storage NetApp's platform can manage within AWS. A scale-out ONTAP file system can now hold up to 1 pebibyte of data, slightly more than a traditional petabyte, compared with the 192 tebibytes of a scale-up system.
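A quick check of the binary units behind those figures (a sketch, nothing AWS-specific):

```python
# The binary units behind the ONTAP scale figures: a pebibyte is 2**50
# bytes, slightly more than a decimal petabyte (10**15 bytes).

PEBIBYTE = 2 ** 50
TEBIBYTE = 2 ** 40
PETABYTE = 10 ** 15

print(PEBIBYTE // TEBIBYTE)          # 1024 TiB in a 1 PiB scale-out file system
print(PEBIBYTE / PETABYTE)           # ~1.126 decimal petabytes per pebibyte
print(PEBIBYTE // (192 * TEBIBYTE))  # 5: over 5x the 192 TiB scale-up limit
```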
Other improvements include increased throughput for reads and writes, as well as IOPS increases. The option is limited to a single availability zone, compared with the multiple zones of scale-up systems.
AWS has showcased NetApp as a file system partner in the past few years, because creating and managing file storage systems is challenging, Sinclair said. Considering NetApp's storage legacy, and its business with other public clouds, Sinclair said he believes AWS benefits far more from courting NetApp than vice versa.
"Building a good enterprise file system is incredibly difficult," Sinclair said. "When you're looking at third-party storage options in AWS, having it with an AWS interface is a differentiation."
Tim McCarthy is a journalist living in the North Shore of Massachusetts. He covers cloud and data storage news.