The AI factory model: What CIOs need to know
The AI experimentation phase is over. Enterprises are now racing to build AI factories that generate revenue, not just consume resources.
Enterprises are facing mounting pressure to deliver value with AI. Experimentation is giving way to scale and hard ROI metrics. Could an AI factory be the model enterprises need to deploy at scale? Many enterprises are betting on it.
Deloitte surveyed 515 leaders of U.S. companies with more than $500 million in annual revenue and found that 70% plan to operate AI factories at scale by 2028.
If CIOs and other technology leaders charged with developing enterprise AI strategy want to embrace AI factories, they need to understand how these models work. How are AI factories different from traditional data centers? What architectural components do you need to build an AI factory? How can you deploy an AI factory?
Then, CIOs need to determine their enterprises' organizational readiness, how the AI factory will fit into an existing hybrid or multi-cloud strategy, the total cost of ownership and risk management.
What are AI factories?
AI factories are platforms that include "purpose-built, high-performance infrastructure -- computing power, network and storage -- paired with AI-optimized software and services," according to the Deloitte survey.
Nvidia CEO Jensen Huang spoke about AI factories during his keynote speech at the company's 2024 GTC event. He said, "An AI factory's goal in life is to generate revenue, generate, in this case, intelligence."
Huang clarified the difference between traditional data centers and his idea of an AI factory in an interview with TechCrunch following his 2024 speech. "The last time, data centers went into your company's cost centers and Capex. You think of it as cost. However, a factory is a different thing. It makes money."
Optimized for AI, these factories have much higher power demands than traditional data centers. Global demand for AI data center power is anticipated to reach 68 GW by 2027; for comparison, total global data center capacity was 88 GW in 2022, according to RAND. The race is on to meet the infrastructure demands of AI: global spending on data centers could hit $7 trillion by 2030, according to McKinsey.
AI factory architecture
The architectural layers of an AI factory include:
- Energy. AI factories cannot run without power. Nearly half of the leaders (48%) Deloitte surveyed expect to take a mixed approach to getting that power -- tapping into the grid, using self-built power and using power generation from third parties.
- Hardware. To manage enterprises' growing, more complex workloads, AI factories need hardware such as ASICs, GPUs, NPUs, TPUs and wafer-scale engines, according to IBM.
- Infrastructure. AI infrastructure for enterprises can range significantly in size. An AI factory might have a few GPU clusters or take up an entire campus, according to Deloitte. AI factories need physical buildings to house hardware, along with facilities for power delivery and cooling. They also require systems for storage and orchestration.
- Data and AI models. Data and AI models are the fuel that an AI factory runs on, according to Nvidia. Where is that data? What kind of access control and security is in place? CIOs need to be able to answer those questions about the data layer of AI factories.
- Applications. AI factories power enterprise applications. As those applications gather new data, it is fed back into the factory to help AI continuously learn and improve, according to Nvidia.
AI factory deployment models
Enterprises can deploy AI factories in various ways. An enterprise's industry, compliance requirements, AI workloads, use cases and budget will help CIOs determine which approach is the best fit.
- On-premises. Enterprises can opt to build their own on-premises AI factories if they have the resources to buy and deploy their own hardware, software and infrastructure in their existing environments (i.e., data centers or private clouds), according to Nvidia's AI Factory Purchasing Guide. Enterprises in highly regulated environments may choose this deployment model to ensure their data privacy and regulatory requirements are met.
- Cloud. Enterprises can rent AI factories from cloud service providers. The vendor is responsible for the hardware, and enterprises use a pay-as-you-go pricing model, according to the AI Factory Purchasing Guide. While this option offers enterprises flexibility and scalability, it gives them limited control over the technology stack. CIOs risk vendor lock-in and data privacy concerns with this option.
- Hybrid. A hybrid AI infrastructure enables enterprises to use both cloud and on-premises resources. CIOs can distribute AI workloads based on varying needs, according to the AI Factory Purchasing Guide.
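A hybrid deployment implies some policy for deciding where each workload runs. As a minimal sketch of how a CIO's team might codify that decision (the `Workload` and `route` names, the criteria and the example workloads are hypothetical illustrations, not from any vendor guide):

```python
from dataclasses import dataclass

@dataclass
class Workload:
    name: str
    handles_regulated_data: bool  # e.g., data that compliance rules keep on-premises
    is_bursty: bool               # short-lived spikes suit pay-as-you-go cloud capacity

def route(workload: Workload) -> str:
    """Return a deployment target for a workload in a hybrid AI factory."""
    if workload.handles_regulated_data:
        return "on-premises"  # keep regulated data inside the enterprise boundary
    if workload.is_bursty:
        return "cloud"        # rent elastic capacity instead of buying idle GPUs
    return "on-premises"      # steady, predictable workloads amortize owned hardware

print(route(Workload("claims-triage", handles_regulated_data=True, is_bursty=False)))
# prints "on-premises"
```

In practice the routing criteria would also weigh latency, data gravity and per-workload cost, but even a crude policy like this forces the conversation about which workloads belong where.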
Planning for an AI factory
CIOs planning for AI factories need to work through several strategic considerations to ensure their enterprises are ready for this model and can adopt it in a way that delivers on its value proposition.
Does my enterprise need an AI factory?
Nvidia argues that every enterprise will need an AI factory. But CIOs need to consider their enterprises' current needs and their AI deployment maturity. "Are you building a factory because you need one or are you building a factory because your vendor is telling you to build one?" asked Adnan Masood, PhD, chief AI architect at UST.
One metric to consider is usage. "If I'm becoming an AI-first type of company, that's heavy volume," said Patrick Anderson, managing director at Protiviti. "If I want speed, I would want to go to something like a factory model."
Many large enterprises pushing AI pilots into production will likely have high token consumption and a compelling case for managing the entire AI lifecycle through an AI factory.
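Token consumption is straightforward to estimate from pilot traffic before committing to factory-scale infrastructure. A back-of-the-envelope sketch, where the request counts and token figures are hypothetical:

```python
def monthly_tokens(requests_per_day: int, avg_input_tokens: int,
                   avg_output_tokens: int, days: int = 30) -> int:
    """Rough monthly token volume for one AI application."""
    return requests_per_day * (avg_input_tokens + avg_output_tokens) * days

# Hypothetical pilot moving to production: 50,000 requests/day,
# ~1,500 input and ~500 output tokens per request.
volume = monthly_tokens(50_000, 1_500, 500)
print(f"{volume:,} tokens/month")  # prints "3,000,000,000 tokens/month"
```

Summing this across every pilot headed to production gives a first-order signal of whether usage is heavy enough to justify the factory model Anderson describes.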
But AI factories aren't necessarily for large companies alone. Marco Bill, senior vice president and CIO of Red Hat, argued that smaller organizations should "not get intimidated that a factory is just for big companies." An AI factory can be large or small, depending on an organization's needs.
Does my enterprise have the organizational readiness for an AI factory?
CIOs need to evaluate their enterprise's readiness before deploying an AI factory, starting with the data. "If your data is not under control, you would get mixed signals and the quality of anything that you would do on scale [wouldn't] work," said Bill.
Enterprises also need the human talent to support an AI factory. The Deloitte survey lists the top roles needed for AI and ML operations: data engineers for AI infrastructure; security and compliance specialists; MLOps, AIOps and agent ops engineers; data scientists for AI orchestration; change management experts; energy monitoring specialists; and robotics systems engineers.
"Talent is a big bottleneck for organizations," said Masood. "If you have upskilled people who already have talent, then you are in a good place to utilize that platform. If not, then you can't."
Talent isn't the only component of an enterprise's human readiness. Culture also plays a huge role. The necessary cultural shift can hold organizations back from successfully adopting an AI factory, according to Bill.
"We have people who are super advanced, and we had some people who are probably more on the traditional spectrum, very conservative," he said. "You've got to cross the chasm and get people excited."
What is my enterprise's infrastructure strategy for an AI factory?
CIOs must figure out how to build AI infrastructure for enterprises, whether it is on-premises, in the cloud or using a hybrid approach.
Hardware lead times are an important consideration, according to Masood. "You can put money down, but still your GPUs are not getting deployed in a timely manner," he said. "You need to think about not just buying the GPUs but also your workload. What are the use cases in production going to be...in the first three months or six months?"
CIOs must understand their enterprises' current and future AI workload management requirements. What kind of capacity does an enterprise need today? How will those capacity needs change as the enterprise works through its AI roadmap?
Third-party vendors can help enterprises with infrastructure planning, but developing criteria for vendor selection is critical.
"AI factories are coming out from not just CSPs but from consulting firms, from hardware vendors, from a lot of different places that you probably wouldn't have expected," said Anderson. "They're all going to have different levels of risk and different roles or responsibilities that you will want to negotiate."
What are the cost considerations for an AI factory?
The goal of an AI factory is to make money, per Huang. But CIOs need to understand the cost of AI factory infrastructure and implementation, and at this point, pinning down those costs is a challenge.
Bill likened it to the early days of cloud adoption. "Everybody moved into the cloud, and you get surprised, 'Oh, this is really expensive.' Then, you started having solutions to actually manage your spend in the cloud," he said. "I think this will happen exactly the same way in this space."
The hardware, infrastructure, power and necessary human talent and usage will all contribute to the costs of an AI factory. "You have to track cost per token, token per use case, GPU utilization. You have to look at a chargeback per business unit," said Masood.
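The metrics Masood lists reduce to simple arithmetic once usage data is collected. A minimal sketch, in which the function names and usage figures are hypothetical illustrations:

```python
def cost_per_token(total_cost: float, tokens: int) -> float:
    """Blended cost per token across an AI factory."""
    return total_cost / tokens

def gpu_utilization(busy_hours: float, available_hours: float) -> float:
    """Fraction of available GPU hours actually doing work."""
    return busy_hours / available_hours

def chargeback(unit_tokens: dict[str, int], total_cost: float) -> dict[str, float]:
    """Allocate total AI factory cost to business units by token share."""
    total = sum(unit_tokens.values())
    return {unit: total_cost * t / total for unit, t in unit_tokens.items()}

# Hypothetical monthly usage by business unit and a $250,000 monthly run cost.
usage = {"marketing": 600_000_000, "support": 300_000_000, "legal": 100_000_000}
print(chargeback(usage, total_cost=250_000.0))
# prints {'marketing': 150000.0, 'support': 75000.0, 'legal': 25000.0}
```

The hard part is not the math but the instrumentation: capturing per-unit token counts and GPU hours reliably enough that the chargeback numbers are defensible to business-unit leaders.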
How will my enterprise approach risk management for the AI factory?
All the same risk considerations discussed from the beginning of the AI boom apply to the rise of AI factories. CIOs should think about data security, safety and outcomes, operational failures, regulatory compliance and cost containment.
CIOs need effective governance to help them surface and manage these risks. "Your governance has to be an operational infrastructure. It's not a PDF file sitting somewhere," said Masood.
Will the AI factory be sustainable?
AI factories need flexibility to serve an enterprise's changing needs. That means CIOs need to move quickly. "You cannot deploy a factory that takes you six months or a year or two years to deploy because by then the technology is outdated," said Bill. "How can you really change your deployment model, your decision making, the whole operation of IT to…deploy a factory in two months or six weeks?"
And then there are power and water considerations. As more data centers come online to support the demand for AI, resource scarcity cannot be ignored. While these issues may eventually be solved, CIOs must consider what happens in the interim. They need to consider the enterprise's energy costs and those of the vendors supporting their AI factories.
"If their power [cost] triples and they can't afford it and they go out of business, what happens to you?" Anderson asked.
Carrie Pallardy is a freelance journalist with experience writing about cybersecurity, technology and healthcare. She currently covers a wide range of issues relevant to today's CIOs and IT leaders.