The rapid rise of AI highlights the need for powerful and efficient networks dedicated to supporting AI workloads and the data used to train AI models.

Data centers built for AI workloads have different requirements than their conventional and even high-performance computing (HPC) counterparts. These workloads don't rely solely on legacy server components. Instead, computing and storage hardware should integrate GPUs, data processing units (DPUs) and smartNICs to accelerate AI training and workloads.

Once integrated, networks must stitch these infrastructure components together and handle workloads with different parameters and requirements. Thus, data center and cloud networks designed for AI must adhere to a unique set of conditions.

To support AI data flows, network engineers must meet critical AI workload requirements, such as high throughput and dense port connectivity. To meet these needs, they should set up data center networks with the right connectivity, protocols, architecture and management tools.

AI workload network requirements

AI data flows differ from client-server, hyperconverged infrastructure and other HPC architectures. The three critical requirements for AI networks are the following:

Low latency, high network throughput. Up to half of the time spent processing AI workloads is spent in the network. HPC network architectures are built to process thousands of small but simultaneous workloads. By contrast, AI flows are few in number but massive in size.

Horizontally scalable port density. AI training uses a large number of network-connected GPUs that process data in parallel. As such, the number of network connections can be eight to 16 times the norm for a data center. Rapid transmission between GPUs and storage mandates that the switch fabric be fully meshed with nonblocking ports to provide the best east-west network performance.

Elimination of human-caused errors. AI workloads are typically massive in size. Up to 50% of the time spent processing AI training data happens during network transport. GPUs must complete all processing on training data before AI applications can use the resulting information. Any disruption or slowdown during this process, no matter how minor, can cause significant delays. Manual configuration is the biggest cause of network outages or degradation. AI infrastructure setups must be resilient and free of human error.
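Two of the requirements above lend themselves to quick back-of-the-envelope checks. The Python sketch below is illustrative only: the function names, port counts and link speeds are hypothetical and not taken from any vendor or from this article. It treats a leaf switch as nonblocking when its host-facing bandwidth does not exceed its fabric-facing bandwidth, and it uses a simple Amdahl-style estimate that assumes network transport and GPU compute do not overlap.

```python
def oversubscription_ratio(downlink_ports, downlink_gbps, uplink_ports, uplink_gbps):
    """Ratio of host-facing to fabric-facing bandwidth on one leaf switch.

    All four parameters are hypothetical inputs chosen for illustration.
    """
    values = (downlink_ports, downlink_gbps, uplink_ports, uplink_gbps)
    if any(v <= 0 for v in values):
        raise ValueError("port counts and speeds must be positive")
    return (downlink_ports * downlink_gbps) / (uplink_ports * uplink_gbps)


def is_nonblocking(ratio):
    """A ratio of 1.0 or lower means uplinks can carry all downlink traffic."""
    return ratio <= 1.0


def job_time_after_network_speedup(total_hours, network_fraction, network_speedup):
    """Estimate job time if only the network portion gets faster.

    Assumes network transport and compute run back to back (no overlap),
    which is a simplification.
    """
    if not 0.0 <= network_fraction <= 1.0:
        raise ValueError("network_fraction must be between 0 and 1")
    if network_speedup <= 0:
        raise ValueError("network_speedup must be positive")
    compute_hours = total_hours * (1.0 - network_fraction)
    network_hours = total_hours * network_fraction / network_speedup
    return compute_hours + network_hours


if __name__ == "__main__":
    # Hypothetical leaf: 32 x 400 Gbps to GPUs, 32 x 400 Gbps to spines.
    balanced = oversubscription_ratio(32, 400, 32, 400)
    print(balanced, is_nonblocking(balanced))

    # Hypothetical conventional leaf: 48 x 100 Gbps down, 8 x 400 Gbps up.
    conventional = oversubscription_ratio(48, 100, 8, 400)
    print(conventional, is_nonblocking(conventional))

    # A 10-hour job with 50% network time, on a network twice as fast.
    print(job_time_after_network_speedup(10, 0.5, 2))
```

Under these assumptions, the balanced leaf is nonblocking while the conventional one is oversubscribed, and doubling network speed on a job that spends half its time in transport shortens the job by a quarter rather than by half. Real fabrics also depend on topology, congestion control and traffic patterns that this sketch ignores.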