Qualcomm, OpenAI, IBM target AI infrastructure efficiency
An acquisition, a planned chip and a new transistor design could mitigate AI's energy and cost issues, but the effects will take time to reach enterprise buyers.
Three major AI infrastructure vendors advanced plans this week to build more efficient AI data centers amid rising costs and energy-efficiency concerns in the industry, but it remains to be seen when cost savings will materialize for downstream customers.
AI data centers are constrained by enormous demands for power and water and are subject to increasing scrutiny from surrounding communities. Tokenomics, or the cost economics of AI tokens, is also taking center stage as a major blocker to enterprise AI development. Qualcomm's acquisition of Modular this week, OpenAI and Broadcom’s partnership to build a new AI inference chip called Jalapeño, and a claimed breakthrough in chip transistor density from IBM are all bids to address these issues.
For example, Jalapeño, OpenAI's first foray into processor hardware, "will deliver performance per watt substantially better than current state-of-the-art," according to a company blog post.
IBM, meanwhile, estimated that its new nanostack technology will deliver a 70% improvement in energy efficiency over its most advanced chips. An estimated 40% increase in static RAM (SRAM) density could help alleviate the global memory shortage, according to IBM.
"There is a common thread of power efficiency for AI workloads" in all of these updates, said Patrick Moorhead, founder, CEO and chief analyst at Moor Insights & Strategy. "Qualcomm racks operate at 200 megawatts versus the latest and greatest GPU racks that are in the gigawatt range. … It's hard to say right now without exact specifics, but [Jalapeño] will likely enable lower-cost AI services."
Qualcomm prepares to challenge Nvidia
Qualcomm, which rolled out new data center infrastructure this week with its Dragonfly line of GPUs, CPUs and accelerators, added a software layer with its approximately $4 billion acquisition of startup Modular.
The acquisition "enables Qualcomm Technologies to deliver a silicon-agnostic compute layer across devices, edge and data centers, improving performance-per-watt, increasing hardware flexibility, and expanding an open developer ecosystem so customers can deploy AI more efficiently across heterogeneous platforms globally," according to a company statement.
Modular's software could provide portability of workloads among data centers, remote and edge locations, and different types of processors, which has the potential to disrupt AI infrastructure economics for enterprises, said Larry Carvalho, principal consultant at RobustCloud.
"With AI budgets running out of control, this gives organizations a safety valve to redirect the fast-growing demand for inference workloads to the least expensive destination," Carvalho said.
Qualcomm and Modular could also represent a strong challenger to Nvidia's CUDA, which has had a "stranglehold" on the AI infrastructure market to date, according to industry analysts. That competition could put pricing pressure on Nvidia and help lower costs for enterprise buyers.
"Nvidia still rules the roost, but it will take time to see whether such alternatives have a measurable effect on Nvidia's revenue," Carvalho said.
Jalapeño: adding real savings or IPO spice?
Jalapeño will reach initial deployment by the end of 2026, but likely won't be available widely until 2028, Moorhead estimated.
"In the short term, it won't help" with AI infrastructure costs, Moorhead said.
One analyst attributed Jalapeño's origins to OpenAI's impending IPO, rather than to cost changes within reach for enterprises.
"Does this put any significant pressure on Nvidia, or will Nvidia have caught up with the Jalapeño performance gains by the time of mass production?" wrote Torsten Volk, an analyst at Omdia, a division of Informa TechTarget, in a LinkedIn post this week. "At the end of the day, all of these silicon efforts seem critical for OpenAI … to fix their token-economics problems and ensure an IPO can actually be successful."
IBM's nanostack: broad implications but a long wait
IBM's new sub-nanometer transistor design is even further away from practical applications, though it has broad potential to change the density of a variety of silicon systems, from CPUs and GPUs to flash memory and more.
The new design stacks sheets of tiny transistors, called nanosheets, which IBM introduced in 2021 with its previous two-nanometer chip design. The nanostack architecture vertically stacks and staggers nanosheets to pack "nearly 100 billion of these transistors onto a chip about the size of a fingernail," said Jay Gambetta, director of IBM Research, in a press briefing introducing the new architecture this week.
A researcher holds a sub-nanometer node chip, a design that IBM Research unveiled this week.
"That's twice the density of the two-nanometer chip, which itself marked a major leap forward," Gambetta said. "All this matters because semiconductors are the foundation of the modern life, powering everything from AI systems to cloud infrastructure to the devices, networks and critical systems that society and business depend on every day."
This breakthrough shows that rumors of the death of Moore's Law have been greatly exaggerated, Moorhead said.
"Every five to ten years, there's this notion that chips and chip architectures have run out of gas, we can't make them any smaller, we can't make them any higher performance, and then new technologies come out that make that possible," he said. "There have been others that have shown off [similar designs], but IBM brought out a wafer, and I've never seen that before. And they also brought out a process development kit, which means you can actually go out and create a chip with this technology, so I would say they've delivered the most proof."
However, IBM's two-nanometer chip designs are just entering volume manufacturing, and IBM officials estimated the new design won't reach mass production until 2030.
Even that seems like a generous timeframe, said Stephen Sopko, an analyst at HyperFrame Research.
"Executing this kind of sequential nanosheet stacking at volume is another order of magnitude harder than the industry's already-challenging transition to [the previous two-nanometer design]," Sopko said. "I view it as directionally encouraging, but I'd watch execution closely, especially manufacturing scalability and real-world yields. A five-year timeline to production strikes me as aggressive."
Beth Pariseau, senior news writer for Informa TechTarget, is an award-winning veteran of IT journalism.