ChatGPT is the hottest tech in the market, with millions of users testing the generative AI system. But running these systems within a data center may require costly upgrades to existing infrastructure.
Reports surfacing earlier this month indicate that training and inference for OpenAI's ChatGPT alone can require 10,000 Nvidia GPUs, and probably more depending on the number and type of other AI implementations. That would be a steep investment for cloud providers and organizations alike in a technology that is still in its early days of development and not yet completely reliable.
While the initial investment is pricey, some analysts say it may be just the cost of doing business in a burgeoning market that is expensive to enter but worth the payoff.
"The cost could be outrageous, but that is what it takes to run these large AI or machine learning systems," said Jack Gold, president and principal analyst at J. Gold Associates LLC. "If you have a mission-critical application to train a model on ChatGPT and it costs $1 million, but you can make back $1 billion, it is worth it. The payback could be huge."
At least initially, such investments might make sense for cash-rich companies such as drug companies developing new drugs or the largest gas and oil companies doing exploration, Gold said.
Adoption of ChatGPT has been nothing short of phenomenal. According to a recent investment report released by UBS, the offering racked up 100 million monthly active users by the end of January, only two months after it officially launched.
Generative AI could affect chip market
While this portends a rosy future for OpenAI as well as the sales of Nvidia's GPUs, high demand could result in a chip shortage, which in turn would drive up the cost of GPUs. One analyst sees the possibility of a shortage happening over time and said second-source providers to Nvidia -- such as AMD and Intel -- will have to step up.
"We have to get [chip] manufacturing in the U.S. to take advantage of products like ChatGPT to ensure its momentum, so this could bode well for firms like Intel," said Dan Newman, principal analyst at Futurum Research and CEO of Broadsuite Media Group.
He added that only a small number of Fortune 500-class companies will be able to support the infrastructure costs to really take advantage of running generative AI systems in-house.
Greater availability of speedy GPUs isn't all that's needed to support offerings like ChatGPT. Organizations must also upgrade other critical components of their infrastructure to run these AI offerings.
"Users have to build up not just compute, but their networking and power management," Newman said. "The cost of the components being priced now are pretty significant, especially those relating to the amount of power needed to handle the required performance.
"You need the ability to continue offering higher performance per watt along with lower power consumption," he said.
When asked what is technically required to host ChatGPT in a data center, as well as to develop AI-based applications, ChatGPT said users need a server or virtual machine with at least 16 GB of RAM, a CPU with at least four cores, and a modern GPU with at least 8 GB of memory.
They will also need to install additional software and libraries, including Python, TensorFlow or PyTorch, along with the Hugging Face transformers library, which is used to load and fine-tune pretrained models.
A more precise list of infrastructure components to run ChatGPT depends on the size and complexity of the model users are working with, as well as the anticipated level of usage.
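The minimums ChatGPT listed can be written down as a simple check. The following is a minimal sketch, not an official sizing tool: `HostSpec` and `missing_requirements` are hypothetical names introduced here for illustration, and the thresholds are only the figures quoted above, which the article notes will vary with model size and usage.

```python
from dataclasses import dataclass
from typing import List

# Minimums as quoted in ChatGPT's answer above; not vendor sizing guidance.
MIN_RAM_GB = 16
MIN_CPU_CORES = 4
MIN_GPU_MEM_GB = 8


@dataclass
class HostSpec:
    """Hypothetical description of a server or virtual machine."""
    ram_gb: float
    cpu_cores: int
    gpu_mem_gb: float


def missing_requirements(host: HostSpec) -> List[str]:
    """Return the quoted minimums that the host fails to meet."""
    problems = []
    if host.ram_gb < MIN_RAM_GB:
        problems.append(f"RAM: {host.ram_gb} GB < {MIN_RAM_GB} GB")
    if host.cpu_cores < MIN_CPU_CORES:
        problems.append(f"CPU cores: {host.cpu_cores} < {MIN_CPU_CORES}")
    if host.gpu_mem_gb < MIN_GPU_MEM_GB:
        problems.append(f"GPU memory: {host.gpu_mem_gb} GB < {MIN_GPU_MEM_GB} GB")
    return problems


if __name__ == "__main__":
    # Example values are made up for illustration.
    small_vm = HostSpec(ram_gb=8, cpu_cores=4, gpu_mem_gb=8)
    for problem in missing_requirements(small_vm):
        print(problem)
```

An empty list means the host meets all three quoted minimums; it says nothing about whether the host is large enough for a particular model or workload.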
There are a number of generative AI systems in the works with various infrastructure requirements. Meta today released its own version, aimed at scientific researchers. The Large Language Model Meta AI (LLaMA) is intended for researchers who do not have access to large amounts of infrastructure to study these models. LLaMA will be available in different sizes, including versions with 7B, 13B, 33B and 65B parameters, Meta said.
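To give a rough sense of what those parameter counts mean for hardware, the sketch below estimates the memory needed just to hold each model's weights. It rests on an assumption that does not come from Meta: each parameter is stored as a 16-bit float (2 bytes). It also ignores everything else a running model needs, such as activations and any serving overhead, so the real footprint would be higher. The function name `weight_memory_gb` is hypothetical.

```python
# Parameter counts Meta announced for LLaMA, in billions.
LLAMA_SIZES_B = [7, 13, 33, 65]

# Assumption (not from Meta): weights stored as 16-bit floats, 2 bytes each.
BYTES_PER_PARAM = 2


def weight_memory_gb(params_billions: float,
                     bytes_per_param: int = BYTES_PER_PARAM) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes.

    One billion parameters at one byte each is one gigabyte, so the
    estimate is simply billions of parameters times bytes per parameter.
    """
    return params_billions * bytes_per_param


if __name__ == "__main__":
    for size in LLAMA_SIZES_B:
        print(f"{size}B parameters: roughly {weight_memory_gb(size):.0f} GB of weights")
```

Under that assumption, the smallest version would need roughly 14 GB for its weights alone and the largest roughly 130 GB, which helps explain why infrastructure requirements differ so much from one model size to another.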
The CEO of one chip startup with a product now in development said users need a massive number of GPUs to handle the range of common AI tasks such as training models, inference and high-performance computing.
"The basic problem with GPUs is they are asked to do too much," said Sid Sheth, president and CEO of d-Matrix Corp., a provider of computing platforms for AI inference. "GPUs like Nvidia's are built specifically for the acceleration of high-end graphics, but they carry too much baggage with them."
Using CUDA, its parallel processing and programming model, Nvidia retargeted its GPUs toward AI computing applications because AI computing looks a lot like graphics processing, according to Sheth.
The offering d-Matrix is developing involves multiple chiplets on a board that can be slid into some existing and new servers, with each chiplet focused on specific applications such as inference.
"With inference, for example, you need a lot of efficiency; it is all about dollars per inference and power efficiency," Sheth said. "Latency occurs with the newer AI workloads because they are all about interactive experiences. Generative AI like ChatGPT are very interactive, so latency is a key metric. There's no way a general GPU can service so many use cases," he said.
Earlier this week, Nvidia CEO Jensen Huang said his company figures to benefit significantly from the arrival of generative AI, singling out OpenAI's ChatGPT.
The company reported that revenue from its data center business, which includes a line of GPUs for AI-based workloads, increased during the fourth quarter, indicating growing interest among users.
"Generative AI's versatility and capability has triggered a sense of urgency at enterprises around the world to develop and deploy AI strategies," Huang told financial analysts during the company's quarterly meeting. "AI is at an inflection point, pushing businesses of all sizes to buy Nvidia chips to develop machine learning software."
As Editor At Large with TechTarget's News Group, Ed Scannell is responsible for writing and reporting breaking news, news analysis and features focused on technology issues and trends affecting corporate IT professionals.