The power demands of generative AI are poised to overwhelm data center capacity, forcing most AI providers to spread processing across the edge and enterprise devices.
Industry data shows that cloud providers and colocation companies building and expanding data centers for GenAI will have to work with organizations developing AI models for edge facilities, corporate PCs, smartphones and IoT devices.
Most experts agree there isn't enough unused electricity to perform future AI processing in only hyperscale data centers.
"The demand for AI is voracious right now," said Jacob Albers, head of alternative insights at commercial real estate services firm Cushman & Wakefield. "And all the major cloud providers and everyone in the data center space is racing to catch up."
Power leases of 36 megawatts -- rare five years ago -- are commonplace today among data center providers and hyperscalers, along with 72 MW leases, according to intelligence provider DatacenterHawk. Leases of 100 MW are no longer unheard of.
Increasing power demand driven by anticipated growth in GenAI has left data center operators and hyperscalers clamoring for electricity in any region with land for construction and sufficient network bandwidth.
"There's a big scramble for power in the industry," Applied Digital CEO Wes Cummins said. "People are looking for power everywhere."
Applied Digital, a data center provider for AI and high-performance computing (HPC), has facilities in North Dakota and plans to add one in Utah next year. Its strategy is to find renewable power -- wind, hydroelectric or solar -- built with funds from the landmark climate bill President Joe Biden signed last year.
A significant amount of that power lacks the infrastructure to connect to the grid, according to recent reporting by The New York Times. Cummins wants to tap into it provided there's sufficient network bandwidth close by.
DataBank, which has 74 data centers spread across 29 metro markets, plans to build only high-density facilities for HPC, including GenAI, said Eric Swartz, vice president of engineering. Those data centers regularly provide 30 kW and 40 kW per server rack installed by customers. That's roughly double the power of traditional server racks.
"Probably the biggest factor at this point is, 'Will there be power?'" Swartz said. "The people that are going to be on top are the people that can secure power."
AI-driven compute demands have slowly increased over the last nine to 12 months, said Barry Buck, marketing leader at data center builder DPR Construction. Current infrastructure has met customers' power needs, but DPR sees companies having to build outside primary data center locations.
"Lack of available power is already an issue in the primary geographies for traditional computing needs. And we're seeing customers build in secondary and tertiary geographies, where power is more readily available," Buck said.
Shrinking models for the edge
Jim McGregor, an analyst and partner at Tirias Research, doesn't see how power suppliers will meet the needs of AI-focused data center infrastructure while supplying electricity to consumers and businesses. Tirias predicts that at the current deployment rate, GenAI infrastructure costs will exceed $76 billion a year by 2028 and consume 4.25 gigawatts, roughly half of the projected power used by U.S. data centers this year.
"The total cost of ownership is just huge," McGregor said. "[However,] every time we run into a technical challenge like this, there's a change in business model that overcomes it."
The tech industry's response to limited power is to build and train GenAI large language models in AI data centers and then distribute models a fraction of the size that handle only inferencing, which means crunching real-time data to answer queries or perform tasks.
Albers said data center providers and hyperscalers are launching facilities from 100 kW to 5 MW for running inferencing models. DatacenterHawk reports seeing edge buildouts from 1 MW to 5 MW for AI workloads in markets such as New Jersey, Houston and Minneapolis.
Organizations building small language models capable of running at the edge include Abacus.AI, Cerebras and MosaicML, which is owned by data storage and management startup Databricks. In March, research group Large Model Systems Organization introduced Vicuna-13B, an open source chatbot with 13 billion parameters, the numeric values a model learns during training.
Models with fewer than 50 billion parameters can run on many edge devices and offer the best chance of performing GenAI processing there, McGregor said.
"I think the best thing we can do today is optimize and shrink the models as much as we can to offload them from data centers," he said.
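As a rough illustration of why parameter count matters for edge devices, the sketch below estimates the memory needed just to store a model's weights. The bytes-per-parameter values are common numeric precisions, not figures from McGregor or Tirias, and the estimate ignores activation memory and runtime overhead, so treat it as a lower bound rather than a sizing guide.

```python
# Back-of-the-envelope estimate of the memory needed to hold a model's weights.
# Assumption: weights dominate; activations, caches and runtime overhead are ignored.

# Storage per parameter at common numeric precisions (illustrative choices).
BYTES_PER_PARAM = {
    "fp32": 4.0,
    "fp16": 2.0,
    "int8": 1.0,
    "int4": 0.5,
}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    # 13 billion matches Vicuna-13B; 50 billion is the threshold McGregor cites.
    for label, params in [("13B model", 13e9), ("50B model", 50e9)]:
        for precision in BYTES_PER_PARAM:
            size = weight_memory_gb(params, precision)
            print(f"{label} @ {precision}: {size:.1f} GB")
```

Under these assumptions, a 13 billion-parameter model needs about 26 GB for its weights at 16-bit precision and about 6.5 GB at 4-bit precision, which is the difference between needing a server-class accelerator and fitting in the memory of a well-equipped PC.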
PC chipmakers are preparing to perform GenAI processing on corporate computers to provide quicker responses to queries. AMD and Intel recently introduced x86 CPUs, and last week Qualcomm launched its Arm-based Snapdragon X Elite for AI inferencing on PCs, set for release in the middle of next year.
Reuters reported last week that Nvidia planned to launch an Arm-based CPU to run Microsoft's Windows operating system, which includes the software maker's Copilot AI assistant.
Enterprises prepping for AI
Today, enterprises mainly test various GenAI services and technologies to determine their value and impact on data privacy and security. However, Gartner predicts that the percentage of enterprises using GenAI APIs, models or applications in production will soar from roughly 5% this year to more than 80% by 2026.
IDC expects enterprise spending on generative AI software, infrastructure hardware and IT services to grow from nearly $16 billion this year to $143 billion in 2027. That's twice the rate of overall AI spending and 13 times greater than the annual growth rate for global IT spending.
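To put the IDC forecast in annual terms, the sketch below computes the compound annual growth rate implied by the rounded figures above. It assumes four years of growth between "this year" and 2027, which is an inference from the article rather than something IDC states here, and because the inputs are rounded the result will not exactly match IDC's published rate.

```python
# Compound annual growth rate (CAGR) implied by two spending figures.
# Assumptions: rounded dollar amounts from the text, and four years of
# compounding between the starting year and 2027.


def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the constant yearly growth rate that turns start into end."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    start = 16e9   # "nearly $16 billion" this year
    end = 143e9    # "$143 billion" in 2027
    rate = cagr(start, end, years=4)
    print(f"Implied growth: {rate:.1%} per year")
```

With these inputs the implied growth works out to roughly 73% a year, which gives a sense of the scale behind the comparison with overall AI and IT spending.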
Such a huge adoption rate will shake up the data center market and spawn new architectures stretching across the edge, IoT devices, PCs and even smartphones.
"We know change is coming. We just haven't seen a ton of it yet," Buck said.
Antone Gonsalves is editor at large for TechTarget Editorial. He has deep and wide experience in tech journalism. Since the mid-1990s, he has worked for UBM's InformationWeek, TechWeb and Computer Reseller News. He has also written for Ziff Davis' PC Week, IDG's CSOonline and IBTMedia's CruxialCIO, and rounded all of that out by covering startups for Bloomberg News. He started his journalism career at United Press International, working as a reporter and editor in California, Texas, Kansas and Florida. Have a news tip? Please drop him an email.