Storage's role in generative AI

Generative AI continues to create industry buzz. Experts say storage plays a critical role in making the technology run.

By Adam Armstrong, TechTarget

Published: 09 May 2023

Generative AI is rising in popularity thanks to a confluence of advances in IT infrastructure, including storage.

Generative AI relies on deep learning, compute and GPUs, all of which have matured in the last ten years. It also needs high-IOPS storage to provide fast access to large datasets, a capability tech vendors have been refining for decades as IT has evolved. Storage tools such as object storage, which can scale for large datasets, and distributed parallel file systems, which provide high-performance, low-latency data processing, have been the backbone of cloud computing and the big data movement.

Now storage is becoming an underlying foundation for AI. Some AI models are small enough to execute in memory, putting more of a spotlight on compute, according to Mike Matchett, an analyst at Small World Big Data, an IT analyst firm. But large language models (LLMs) like ChatGPT require, in some cases, billions of nodes, which are cost prohibitive to keep in memory.

"You're not holding [billions of] nodes in memory. The storage becomes a lot more important," Matchett said.

Despite its speed, memory such as RAM is more expensive than storage, according to Steve McDowell, an analyst and founding partner at NAND Research.

"You're always going to be limited by the cost of RAM, and it's always going to be a balance [with storage]," McDowell said.
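The trade-off the analysts describe can be sketched with some rough arithmetic. The example below is illustrative only: it assumes each "node" corresponds to one stored parameter, that each parameter takes 2 bytes, and that a server has 64 GB of RAM. None of these figures come from the analysts, and the function names are hypothetical.

```python
# Rough sizing sketch: do a model's weights fit in RAM?
# Assumptions (not from the article): one stored parameter per "node",
# 2 bytes per parameter, and a 64 GB RAM budget.

def weights_size_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights in decimal gigabytes."""
    return num_params * bytes_per_param / 1e9


def fits_in_memory(num_params: int, memory_gb: float, bytes_per_param: int = 2) -> bool:
    """True if the weights alone fit in the given amount of memory."""
    return weights_size_gb(num_params, bytes_per_param) <= memory_gb


if __name__ == "__main__":
    ram_gb = 64  # hypothetical server RAM
    for params in (1_000_000_000, 100_000_000_000):
        size = weights_size_gb(params)
        verdict = "fits in" if fits_in_memory(params, ram_gb) else "exceeds"
        print(f"{params:,} parameters: about {size:,.0f} GB, {verdict} {ram_gb} GB of RAM")
```

Under these assumptions, a model with a billion parameters sits comfortably in memory, while one with 100 billion does not, which is where the storage tier starts to matter.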

He said LLMs would need a parallel file system, such as Weka or Panasas, sitting on top of a high-performance scalable storage system, such as Dell's PowerMax, Vast Data's Universal Storage or Pure Storage's FlashBlade.

Storage's role in generative AI

Generative AI can only produce a good outcome after being trained on reams of data, according to Khalid Eidoo, co-founder and CTO of Crater Labs, an AI and machine learning company based in Toronto that works with businesses to solve specific problems using AI. One method Crater employs is a type of generative AI called generative adversarial networks (GANs), which it used to identify potential structural defects in welds when constructing a nuclear power plant.

In this case, the GAN, which uses four different neural networks, produces images that then get reconciled. Out of the hundreds of thousands of images generated, only five or six meet the high quality level needed, Eidoo said.

To support this functionality, Crater needed high-throughput storage that could read and write simultaneously, and it chose Pure Storage's FlashBlade product. "When dealing with generative networks, you're simultaneously reading millions of images to write millions of images," Eidoo said.

GPUs play an important role in generative AI by accelerating the training of models. But when working with millions of images, the GPU buffer quickly fills up and images need to be written quickly to storage, Eidoo said. High-throughput storage can reduce the potential for a data bottleneck.
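A back-of-the-envelope calculation shows why the buffer fills. The sketch below is not Crater's method; the image rate, image size, buffer size and storage write rate are all hypothetical values chosen for illustration, as are the function names.

```python
# Sketch: how quickly does a GPU-side buffer fill when storage can't keep up?
# All figures are hypothetical and chosen only to illustrate the arithmetic.

def sustained_throughput_gb_s(images_per_second: float, image_size_mb: float) -> float:
    """Data rate produced by the model, in decimal GB per second."""
    return images_per_second * image_size_mb / 1000


def seconds_until_buffer_full(buffer_gb: float, produce_gb_s: float, drain_gb_s: float) -> float:
    """Time until the buffer fills, or infinity if storage drains it fast enough."""
    net_fill_rate = produce_gb_s - drain_gb_s
    if net_fill_rate <= 0:
        return float("inf")
    return buffer_gb / net_fill_rate


if __name__ == "__main__":
    produced = sustained_throughput_gb_s(images_per_second=10_000, image_size_mb=0.5)
    for storage_write_gb_s in (1.0, 5.0):
        t = seconds_until_buffer_full(buffer_gb=16, produce_gb_s=produced,
                                      drain_gb_s=storage_write_gb_s)
        print(f"storage at {storage_write_gb_s} GB/s: buffer full in {t} s")
```

With these numbers, storage that writes at 1 GB/s lets a 16 GB buffer fill in a few seconds, while storage that matches the 5 GB/s production rate never lets it fill.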

Flash not necessary, but optimal

High-IOPS storage can provide a user experience more like high-performance computing, according to Matchett.

"You can do parallel file systems on a large number of spinning disks in aggregate," Matchett said.

A parallel file system feeds data for the LLMs to the GPUs. One example is DDN's A3I, which combines DDN's EXAScaler parallel file system with NVIDIA's DGX systems, Matchett said.

A hybrid version of EXAScaler could be used for generative AI, but it caches and tiers storage, potentially affecting performance, McDowell said. The GPUs can't sit idle, so data from the aggregated HDDs will be cached on SSDs, which operate faster than HDDs.

"[Those] that are serious about large language models, they're buying high-end flash storage," McDowell said.

Flash provides high IOPS in denser footprints and can also provide LLMs with aggregated performance, Eidoo said. It's possible to use millions of HDDs, but footprint matters. Flash storage is denser, higher performing and uses less power than HDDs. Technology that reduces power consumption now will benefit generative AI in the future.
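The footprint argument comes down to simple division: how many drives does it take to reach a target aggregate IOPS figure? The sketch below uses placeholder per-drive numbers (200 IOPS for an HDD, 500,000 for a flash drive) that are assumptions for illustration, not vendor specifications or figures from the analysts.

```python
import math

# Sketch: drives needed to reach an aggregate IOPS target.
# Per-drive IOPS values are illustrative assumptions, not measured figures.

def drives_needed(target_iops: int, iops_per_drive: int) -> int:
    """Minimum number of drives whose combined IOPS meets the target."""
    if iops_per_drive <= 0:
        raise ValueError("iops_per_drive must be positive")
    return math.ceil(target_iops / iops_per_drive)


if __name__ == "__main__":
    target = 1_000_000  # hypothetical aggregate IOPS requirement
    for label, per_drive in (("HDD", 200), ("flash", 500_000)):
        count = drives_needed(target, per_drive)
        print(f"{label}: {count:,} drives for {target:,} IOPS")
```

Even with generous assumptions for spinning disk, the drive count, and with it the rack space and power draw, differs by orders of magnitude.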

"GPUs use power like there's no tomorrow," Eidoo said.

Cloud vs. on premises

LLMs also need space for training. Whether that space is on premises, in the public cloud or a hybrid of the two depends on the size of the model and the performance and control needed, Matchett said.

If generative AI is used for research, storing LLMs in the cloud is ideal because users can get the scale required without a capex investment in infrastructure. However, Matchett predicts vendors will offer generative AI applications that will become core to their business platforms. For those that are dependent on performance and security, on-premises storage will be key.

"As an enterprise operation, you've got production workloads that are running at some level of continuity, and that can get expensive," Matchett said.

Crater Labs worked with AWS and Google Cloud before moving to a hybrid infrastructure for speed, security and costs. Crater considered NetApp and HPE before choosing Pure Storage.

Now, Crater Labs uses a combination of on-premises FlashBlade storage and FlashBlade's built-in connection to an S3 object store bucket, according to Eidoo. Crater generates terabytes of data per week, which is inefficient to store solely on premises. Using the S3 object store lets Crater access images in the cloud for modeling.

"We knew very quickly as we started developing these generative models that the performance we were getting in the cloud wasn't adequate," Eidoo said.

Adam Armstrong is a TechTarget Editorial news writer covering file and block storage hardware and private clouds. He previously worked at StorageReview.com.
