Cisco Silicon One touts efficiency breakthrough with AI chip
Cisco claims its Silicon One P200 chip can replace larger, more power-hungry data center interconnects, as AI development pushes the limits of power grids.
A new Cisco Silicon One chip and AI routing system will mainly be used by hyperscaler customers for now, but experts believe they could play a key role in the enterprise AI infrastructure of the future.
Cisco's Silicon One, first launched in 2019, is a high-scale networking chip architecture that can be packaged into multiple network devices for various roles in data centers, ranging from low-end converged access and edge routers to core data center switches. This week, Cisco began shipping the latest Silicon One chip, the P200 AI networking chip, and a corresponding set of 8223 series routers to hyperscaler customers, including Alibaba and Microsoft Azure.
The P200 and corresponding router systems are built for scale-across networking, a term popularized by Nvidia with the launch of its Spectrum-XGS Ethernet platform in August, designed to scale AI networks beyond the limits of a single data center. Cloud hyperscalers and AI frontier model developers alike have been hitting these limits, and the limits of a single city's power grid, as their demands for GPU-based compute power explode, said Sameh Boujelbene, an analyst at Dell'Oro Group.
"This is one of the biggest pain points in the industry," Boujelbene said. "The power budget in a given data center, and sometimes even in a given city, is very limited and cannot really connect hundreds of thousands of GPUs … so [hyperscalers] need to distribute that AI cluster across multiple data centers, across cities and sometimes even among multiple cities."
Nvidia's Spectrum-XGS comprises its Spectrum-X Ethernet switches; its latest Nvidia ConnectX-8 SuperNICs, which support 800 Gbps throughput; and software algorithms that dynamically adapt congestion control and latency management to the distance between data center facilities. This enables multiple facilities to operate as a geographically distributed GPU cluster. Nvidia partner CoreWeave uses the Spectrum-XGS system.
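The core constraint behind distance-aware congestion control is the bandwidth-delay product: the farther apart two facilities sit, the more in-flight data the network must track and buffer to keep a high-speed link full. A back-of-envelope sketch illustrates the scale involved (the distance and fiber-speed figures below are illustrative assumptions, not vendor specifications):

```python
# Back-of-envelope bandwidth-delay product for a long-haul AI interconnect.
# The link rate matches the ConnectX-8 class figure in the article; the
# distance and fiber propagation speed are illustrative assumptions.

link_gbps = 800                    # per-port throughput
distance_km = 100                  # assumed separation between facilities
fiber_km_per_s = 200_000           # light in fiber travels at roughly 2/3 c

rtt_s = 2 * distance_km / fiber_km_per_s           # round-trip time
bdp_bytes = (link_gbps * 1e9 / 8) * rtt_s          # data in flight per port

print(f"RTT: {rtt_s * 1e3:.1f} ms")                # 1.0 ms
print(f"In-flight data: {bdp_bytes / 1e6:.0f} MB") # 100 MB per port
```

At 100 km, a single 800 Gbps port keeps roughly 100 MB in flight per round trip, which is why congestion control tuned for in-rack latencies breaks down across cities.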
While Nvidia touted the system's performance over off-the-shelf internet hardware at launch, it uses chips similar to those it deploys inside data centers, Boujelbene said. Cisco's Silicon One P200 is more akin to Broadcom's high-end Jericho 4 family of networking chips, which Broadcom began sampling with customers in August and which Arista and Juniper use in high-end data center routing systems. Jericho 4 sits alongside Broadcom's Tomahawk network switch products; the Tomahawk Ultra supports up to 51.2 Tbps of connectivity on a single chip.
Cisco claims its new P200 chip packs the processing power of 92 of its previous chips into one, reducing its 8223 routing system's power consumption by 65%.
Cisco touts AI network efficiencies in silicon and beyond
Cisco designs the AI chip along with the switches and routing systems in the Silicon One product line, which it redesigned to further optimize power efficiency. According to a company blog post, this resulted in "routing capabilities with switching bandwidth and efficiency" in a single fixed three-rack-unit (3RU) chassis. The 8223 system and P200 offer 51.2 Tbps of throughput, comparable to the previous Cisco 8804 system released in 2023, a 10RU chassis built around 92 chips; Cisco says the new system cuts power consumption by 65% compared with it.
Cisco claims that its packet processing inside Silicon One is unique in the industry, with a fully shared deep buffer. Deep buffers aren't new, but they have gotten a bad rap in some parts of the industry as too slow and inefficient to support AI workloads, said Rakesh Chopra, senior vice president and fellow at Cisco, during a press briefing this week.
"[There's a] belief that deep buffers slow down AI workloads -- if you dig into the details, it's actually not true," he said. "Because of poor congestion management, we get congestion in the network, and those buffers absorb that. So, making use of the predictability of AI workloads and proactive congestion control, you can solve that, but deep buffers are still needed to deal with failures that occur in the network -- and that, at this scale, is the norm, not an exception."
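Chopra's point is that buffers exist to absorb bursts that congestion control can't prevent, such as traffic rerouted around a failed link. A toy discrete-time model makes the trade-off concrete; the packet counts and rates below are invented for illustration and don't represent any real switch:

```python
# Toy model of why buffer depth matters during a failure: a burst arrives
# at twice the rate the link can drain it. A shallow buffer overflows and
# drops packets; a deep buffer absorbs the excess. All sizes are invented
# for illustration, not taken from any real device.

def simulate(buffer_pkts, burst_pkts, arrive_per_tick, drain_per_tick):
    """Return the number of packets dropped while absorbing the burst."""
    queued, dropped, remaining = 0, 0, burst_pkts
    while remaining > 0 or queued > 0:
        arriving = min(arrive_per_tick, remaining)
        remaining -= arriving
        dropped += max(0, queued + arriving - buffer_pkts)  # overflow drops
        queued = min(queued + arriving, buffer_pkts)
        queued = max(0, queued - drain_per_tick)            # link drains
    return dropped

# Same 4,000-packet burst at 2x line rate:
print(simulate(buffer_pkts=100, burst_pkts=4000,
               arrive_per_tick=20, drain_per_tick=10))   # shallow: drops
print(simulate(buffer_pkts=4000, burst_pkts=4000,
               arrive_per_tick=20, drain_per_tick=10))   # deep: 0 drops
```

The shallow buffer forces drops for as long as the burst outpaces the drain rate, while the deep buffer rides it out losslessly, which is the failure-handling role Chopra describes.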
Advancements in the performance of the P200 also help speed deep buffering, and a single shared deep buffer in the new 8000 system drives further power efficiencies, Chopra said.
"We actually don't move packet data around this fully shared packet buffer -- we write packets once, we read them once, and all we're doing is manipulating descriptors," he said. "Once you've accomplished all of that, there's a compounding effect, because everything has shrunk way down -- we can drive our fan power down; we can save on power conversion … we're chasing out every single watt in that system … because that is the fundamental limitation in the industry."
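The write-once, read-once scheme Chopra describes resembles a classic zero-copy design: payloads live in one shared pool, and queues pass around small descriptors instead of copying data. A minimal sketch of the general technique, not Cisco's actual implementation:

```python
# Minimal sketch of a descriptor-based shared packet buffer: payload bytes
# are written into one shared pool exactly once, and per-port queues hold
# only small (offset, length) descriptors. Forwarding a packet between
# queues moves a descriptor, never the payload. This illustrates the
# general zero-copy technique, not Cisco's design.

from collections import deque

class SharedPacketBuffer:
    def __init__(self):
        self.pool = bytearray()   # single shared payload store
        self.queues = {}          # port name -> deque of descriptors

    def write(self, port, payload: bytes):
        desc = (len(self.pool), len(payload))  # (offset, length)
        self.pool += payload                   # payload written once
        self.queues.setdefault(port, deque()).append(desc)

    def forward(self, src, dst):
        # Requeue to another port by moving the descriptor only; the
        # payload bytes in the pool are never touched or copied.
        desc = self.queues[src].popleft()
        self.queues.setdefault(dst, deque()).append(desc)

    def read(self, port) -> bytes:
        off, length = self.queues[port].popleft()
        return bytes(self.pool[off:off + length])  # payload read once

buf = SharedPacketBuffer()
buf.write("port1", b"hello")
buf.forward("port1", "port7")   # only a 2-tuple descriptor moves
print(buf.read("port7"))        # b'hello'
```

Because forwarding touches only a tuple, memory bandwidth scales with the number of queue operations rather than with packet sizes, which is the efficiency property the shared-buffer design targets.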
Today, hyperscalers … tomorrow, enterprise AI?
Enterprises are still well behind frontier model developers and cloud hyperscalers in pushing the limits of AI networks and data centers, Boujelbene said.
She said there could be a trickle-down effect if hyperscalers can use scale-across AI networks to put data centers in regions where power is cheaper while maintaining cluster coherence among them.
Boujelbene added that certain early-adopter segments in enterprise AI, such as healthcare and bioscience companies and national governments, might require scale-across systems sooner than most.
"Every government looks at AI as a very, very big differentiator, and it's a matter of life or death," she said. "Governments are racing, not only to build their AI infrastructure, but also to gain control over it. Who's going to deploy that and where is a major topic, and Cisco is very active in government projects not only in the U.S., but in the Middle East and in Europe."
In the long term, the modular design of Cisco Silicon One systems could serve as an entry point for enterprises as their AI infrastructures expand, said Matthew Kimball, an analyst at Moor Insights & Strategy, in an email to Informa TechTarget.
"AI is going to cause many enterprise organizations to be hyperscale-like in behavior, laying down the fastest and richest networking fabric and interconnect to drive agentic AI at scale," Kimball wrote. "Silicon One P200 is a single architecture with five [device profiles] that span different deployment requirements. The consistency of architecture means this hyperscale-like requirement of AI can be served to the enterprise."
Beth Pariseau, a senior news writer for Informa TechTarget, is an award-winning veteran of IT journalism covering DevOps. Have a tip? Email her or reach out @PariseauTT.