How to optimize networks for AI workloads in the cloud
An AI investment could be leaking money through poor cloud networking. Discover how IT leaders cut costs, boost speed and gain a competitive advantage with strategic optimization.
At first glance, cloud networking may not seem to exert a major impact on AI workloads. After all, cloud networks are typically quite fast and reliable. IT leaders might plausibly assume that AI performance and cost optimization have much more to do with factors like model design and training processes than with cloud network configuration.
In practice, however, cloud networks can play a significant role in the effectiveness of AI deployments -- which is why businesses seeking to make the most of AI must carefully plan their cloud networking architectures and configurations. In the "State of AI for Networking 2026" survey by Extreme Networks, 92% of IT executives report that AI has increased computing and bandwidth demands, which underscores how AI adoption strains existing infrastructure and drives a need for more scalable, cloud-based networks.
Find out how cloud networks impact AI workloads, what the main challenges are and strategies to optimize cloud networks for AI.
How do cloud networks impact AI workloads?
Cloud networking impacts AI in a variety of ways, including:
- Performance. To enable real-time decision-making -- a common goal for AI use cases -- data needs to move across networks with minimal latency.
- Reliability. Networking issues, such as limited bandwidth or dropped packets, can cause AI workloads to fail because they can’t reliably access the data they need.
- Cost. Moving data across networks comes at a cost. This is especially true when data leaves cloud environments, resulting in egress charges from cloud providers.
- Scalability. Limits on network bandwidth and performance can constrain how far AI workloads scale. For instance, the number of prompts that a cloud-based model can process per second during inference depends on how quickly cloud networks can transfer prompts between users and models.
In short, for AI models that are hosted in the cloud, and/or that send or receive data from cloud environments, cloud networks can easily become the weakest link in overall model performance, reliability, cost and scalability.
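To make the scalability point concrete, here is a rough back-of-the-envelope sketch of the ceiling that bandwidth alone places on inference throughput. The function name, the payload sizes and the 1 Gbps link are illustrative assumptions; the calculation also assumes prompts and responses share one link and ignores protocol overhead, so real throughput would be lower.

```python
def max_prompts_per_second(link_gbps: float, prompt_bytes: int, response_bytes: int) -> float:
    """Upper bound on requests per second imposed by link bandwidth alone.

    Simplifying assumptions: prompt and response traffic share the same
    link capacity, and protocol overhead (headers, TLS, retries) is ignored.
    """
    bits_per_request = (prompt_bytes + response_bytes) * 8
    link_bits_per_second = link_gbps * 1e9
    return link_bits_per_second / bits_per_request


# Hypothetical workload: 4 KB prompts and 6 KB responses over a 1 Gbps link.
ceiling = max_prompts_per_second(link_gbps=1.0, prompt_bytes=4000, response_bytes=6000)
print(f"Bandwidth-bound ceiling: {ceiling:,.0f} prompts per second")
```

With these example numbers, the link caps out at 12,500 prompts per second no matter how much compute sits behind it. Larger payloads, such as prompts carrying long documents or images, lower that ceiling proportionally.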
What are cloud networking challenges for AI workloads?
Cloud networks are generally reliable. They tend to perform better and experience fewer failures than the networks that organizations deploy in their own on-premises data centers.
But that doesn't mean that businesses can simply settle for the default network configuration offered by their cloud provider(s) and call it a day. On the contrary, there are several challenges to consider for AI workloads that are specific to cloud networking, such as:
- Multi-cloud networking delays. Cloud networks tend to perform very well when moving data within the same cloud. But for organizations that use a multi-cloud architecture, latency and bandwidth limitations when moving data between distinct clouds can become a bottleneck for AI performance.
- Egress charges. As mentioned above, cloud providers in most cases charge for egress, meaning the transfer of data to a location outside their cloud platforms. This complicates AI cost management: If models are constantly transferring data between clouds, egress bills will quickly add up.
- Access controls. Securing cloud networks requires deploying access control policies that restrict how resources can interact. Configuration oversights, however, may cause AI performance issues. For example, a misconfigured policy might prevent two AI agents from communicating with each other.
- Observability. Monitoring and observing network performance can be challenging in any context. It's especially complicated when deploying AI workloads in the cloud, due to high data transfer volumes and the complexity of cloud networking architectures.
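The egress challenge above lends itself to a quick estimate. The sketch below is a simple multiplication, not a pricing tool: the function name, the daily transfer volume and the per-gigabyte rate are hypothetical placeholders, and actual rates vary by provider, region and volume tier.

```python
def monthly_egress_cost(gb_per_day: float, price_per_gb: float, days: int = 30) -> float:
    """Estimate a monthly egress bill from average daily transfer volume.

    price_per_gb is a placeholder; look up the real rate for your
    provider, region and volume tier before relying on the result.
    """
    return gb_per_day * days * price_per_gb


# Hypothetical scenario: models move 500 GB per day between two clouds
# at an assumed flat rate of $0.10 per GB.
estimate = monthly_egress_cost(gb_per_day=500, price_per_gb=0.10)
print(f"Estimated monthly egress: ${estimate:,.2f}")
```

Under these assumed numbers, the transfers would cost about $1,500 per month. Running the same calculation with measured volumes is an easy way to decide whether an architecture change is worth the effort.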
How do I optimize cloud networks for AI?
Here are some best practices businesses should consider to get the most out of cloud networks that help integrate their AI workloads with each other and with other parts of their IT estates.
VPCs
Virtual private clouds (VPCs) are a type of cloud networking resource that isolates workloads at the network level. They can help improve AI security without compromising performance. For example, by deploying an AI model within a VPC, a business can more easily restrict which human users and software services are able to connect to the model. VPCs can also assist with AI security practices like prompt filtering because they can pass all prompts and responses through a central endpoint where filtering can take place.
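The idea of restricting which sources can reach a model can be sketched with Python's standard `ipaddress` module. This illustrates only the allowlist logic; it is not how any cloud provider implements VPC rules, and the subnet ranges and function name are made-up examples.

```python
import ipaddress

# Hypothetical private subnets that are permitted to reach the model endpoint.
ALLOWED_SUBNETS = [
    ipaddress.ip_network("10.0.1.0/24"),  # e.g., an application tier
    ipaddress.ip_network("10.0.2.0/24"),  # e.g., an internal tools tier
]


def is_allowed(source_ip: str, allowed=ALLOWED_SUBNETS) -> bool:
    """Return True if source_ip falls inside any permitted subnet."""
    address = ipaddress.ip_address(source_ip)
    return any(address in subnet for subnet in allowed)


print(is_allowed("10.0.1.25"))    # inside the first subnet
print(is_allowed("203.0.113.7"))  # outside every permitted subnet
```

In a real deployment, the equivalent rules would live in the cloud provider's network configuration rather than in application code, but the decision being made is the same.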
Agent meshes
Agent meshes are an emerging technology whose main purpose is to integrate AI agents. In most cases, they work by routing communications between agents and AI models through a central hub, where data can be filtered, transformed, blocked and so on. This eliminates the need to perform these tasks within each individual agent.
From a cloud networking perspective, agent meshes can do much to boost performance and mitigate security issues. They could, for example, strip unnecessary data from requests that an agent sends to a model. This would reduce the amount of data that needs to travel over the network, which would in turn improve performance and (if the data is moving between cloud environments) reduce egress costs.
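As a minimal sketch of that kind of trimming, the snippet below drops every field from a request that the model does not need and compares payload sizes before and after. The field names, the allowlist and both helper functions are hypothetical; they do not reflect the API of any particular agent mesh product.

```python
import json

# Hypothetical set of fields the model actually needs.
FIELDS_MODEL_NEEDS = {"model", "prompt", "max_tokens"}


def trim_request(request: dict) -> dict:
    """Keep only the fields on the allowlist; drop everything else."""
    return {key: value for key, value in request.items() if key in FIELDS_MODEL_NEEDS}


def payload_size(request: dict) -> int:
    """Size in bytes of the request when serialized as UTF-8 JSON."""
    return len(json.dumps(request).encode("utf-8"))


# Example agent request carrying extra state the model never reads.
raw_request = {
    "model": "example-model",
    "prompt": "Summarize the attached ticket.",
    "max_tokens": 256,
    "debug_trace": ["step-1", "step-2", "step-3"],
    "ui_state": {"theme": "dark", "sidebar_open": True},
}

trimmed = trim_request(raw_request)
print(f"Before: {payload_size(raw_request)} bytes, after: {payload_size(trimmed)} bytes")
```

Doing this once at the hub, rather than in every agent, keeps the filtering rule in one place and shrinks every request that crosses the network.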
Edge computing for AI
Edge computing means the deployment of workloads close to end users rather than in central cloud data centers. In the context of AI workloads -- especially those that need to respond to user requests in real time -- edge computing can be a powerful way to boost performance by minimizing the distance data needs to travel.
Edge computing for AI also presents a variety of challenges, including the need to deploy infrastructure at the edge that can host compute-hungry AI models. But from a networking perspective, the performance gains can be substantial.
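A rough calculation shows why distance matters. Light travels through optical fiber at roughly 200,000 km per second, which sets a floor on round-trip time that no amount of tuning can remove. The sketch below computes only that floor; the function name and the two example distances are illustrative assumptions, and real latency would be higher once routing, queuing and processing are included.

```python
# Approximate speed of light in optical fiber: ~200,000 km/s, or 200 km per millisecond.
FIBER_KM_PER_MS = 200.0


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time from propagation delay alone."""
    return 2 * distance_km / FIBER_KM_PER_MS


# Hypothetical comparison: a distant cloud region vs. a nearby edge site.
print(f"Central region (2,000 km): {min_round_trip_ms(2000):.1f} ms minimum")
print(f"Edge site (50 km): {min_round_trip_ms(50):.1f} ms minimum")
```

In this example, the distant region carries at least 20 ms of unavoidable round-trip delay per request, compared with 0.5 ms for the nearby edge site.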
Interconnection
Networking interconnects are dedicated networks that run between two or more specific sites -- such as between two different public clouds, or between a public cloud and a private data center.
Because interconnects enable businesses to transfer data over dedicated infrastructure rather than the "generic" internet, they can dramatically boost network performance. They are another way to speed up AI workloads, especially those that run in real time.
Chris Tozzi is a freelance writer, research adviser, and professor of IT and society who has previously worked as a journalist and Linux systems administrator.