Enterprises struggle to learn Microsoft Sonic networking

Problems with Microsoft Sonic include a lack of management tools, weak support from hardware vendors and a steep learning curve for engineers, companies said at the OCP Virtual Summit.

Enterprises learning how to use Microsoft Sonic in a production environment often struggle with the lack of management tools for the open source network operating system.

Other challenges revealed this week during a panel discussion at the OCP Virtual Summit included weak support for Sonic hardware. Also, the panelists said engineers had to work hard to understand how to operate the software.

The companies that participated in the discussion included Target, eBay, T-Mobile, Comcast and Criteo. All of them plan to eventually make Sonic their primary network operating system in the data center.

In general, they are seeking vendor independence and more control over the development and direction of their networks. They expect to achieve network automation similar to that of Sonic users Facebook and Microsoft. Microsoft built Sonic and gave it to the Open Compute Project (OCP) for further development.

Challenges with Microsoft Sonic

Target is at the tail end of its evaluation of Sonic. The retailer plans to use it to power a single super-spine within a data center fabric, said Pablo Espinosa, vice president of engineering. The company plans to put a small percentage of a production workload on the network operating system (NOS) in the next quarter.

Eventually, Target wants to use Sonic to provide network connectivity to hundreds of microservices running in cloud computing environments. Target has virtualized almost 85% of its data centers to support cloud computing.

Target's engineers have experience in writing enterprise software but not code to run on a NOS. Therefore, the learning curve has been steep, Espinosa said. "We're still building this muscle."

As a result, Target has turned to consultants to develop enterprise features for Sonic and take it through hardware testing, regression testing and more, Espinosa said.

Online advertising company Criteo was the only panel participant to have Sonic in production. The company is using the NOS at the spine and super-spine levels in one of nine network fabrics, engineering manager Thomas Soupault said. The system has 64 network devices serving 3,000 servers.

Also, the company is building a 400 Gb Ethernet data center fabric in Japan that will run only Sonic. The network will eventually provide connectivity to 10,000 servers.

One of Criteo's most significant problems is getting support for low-level issues in the open hardware running the NOS. Manufacturers won't support any software unless required to in the contract.

Therefore, companies should expect difficult negotiations over support for drivers, the software development kit for the ASIC, and the ASIC itself. Another area of contention is the Switch Abstraction Interface (SAI) that comes with the device and lets the buyer load its NOS of choice, Soupault said.

"It can be tricky," he said. "When we asked all these questions to manufacturers, we got some good answers, and some very bad answers, too."

Soupault stopped short of blaming manufacturers. Buyers and vendors are still struggling with the support model for Sonic. "If we could clarify this area, it might help others on Sonic" and boost adoption, he said.

Network management tools for Sonic are also in their infancy. Within eBay, developers are building agents and processes on the hardware for detecting problems with links and optics, said Parantap Lahiri, vice president of data center engineering at the online marketplace. However, discovering the problems is only the first step -- eBay is still working on tools for identifying the root cause of problems.
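
As a rough illustration of the detection step, the kind of check such an agent might run on link and optics readings can be sketched in a few lines of Python. This is not eBay's tooling or a Sonic API. The `PortReading` fields, the port names and the -10 dBm alarm threshold are all assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical per-port reading; the field names are illustrative
# and do not follow any Sonic schema.
@dataclass
class PortReading:
    name: str
    oper_up: bool          # whether the link is operationally up
    rx_power_dbm: float    # received optical power reported by the transceiver

# Assumed alarm threshold, chosen only for this example.
RX_POWER_MIN_DBM = -10.0

def find_problems(readings, rx_min=RX_POWER_MIN_DBM):
    """Return (port, symptom) pairs for ports that look unhealthy."""
    problems = []
    for r in readings:
        if not r.oper_up:
            problems.append((r.name, "link down"))
        elif r.rx_power_dbm < rx_min:
            problems.append((r.name, "low optical rx power"))
    return problems

readings = [
    PortReading("Ethernet0", True, -2.1),
    PortReading("Ethernet4", False, -40.0),
    PortReading("Ethernet8", True, -12.5),
]
print(find_problems(readings))
```

A check like this only reports symptoms. It does not say whether a dirty connector, a failing optic or a bad cable is to blame, which is the root-cause gap the panelists described.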

Comcast is developing a repository for streaming network telemetry that network monitoring tools could analyze to pinpoint problems, said Yiu Lee, the company's vice president of network architecture. However, Comcast could use help from OCP members.

"We hope that the community will come together to build the tools and make the product easier to manage [through] more visibility for the operations teams," he said.

Some startups are trying to fill the void. Network automation startup Apstra announced at the summit support for Sonic-powered leaf, spine and super-spine switches.

Going slowly with Microsoft Sonic

The panelists advised companies that want to use Sonic to start with a low-risk deployment with a clearly defined use case. They also recommended choosing engineers who are willing to learn different methods for operating a network.

Lahiri from eBay suggested that companies initially deploy Sonic on a single spine within a group. That would provide enough redundancy to overcome a Sonic failure.
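
The arithmetic behind that advice can be sketched briefly. The example below assumes traffic is balanced evenly across identical spines in a group, which the panel did not specify, and the group sizes are hypothetical.

```python
def remaining_capacity(total_spines, failed_spines=1):
    """Fraction of a spine group's capacity left after some spines fail.

    Assumes identical spines with traffic spread evenly across them.
    """
    if total_spines <= 0 or not 0 <= failed_spines <= total_spines:
        raise ValueError("invalid spine counts")
    return (total_spines - failed_spines) / total_spines

# Hypothetical group sizes, with one Sonic spine failing in each.
for n in (4, 8, 16):
    print(n, remaining_capacity(n))
```

Under those assumptions, the larger the group, the smaller the share of capacity put at risk by a single Sonic spine.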

Soupault advised designing a network architecture around Sonic. Criteo is using the NOS in an environment similar to that of Facebook and Microsoft, he said. "Our use case is very close to what Sonic has been built for."

A company that wants to use the NOS also should be prepared to funnel the money saved from licensing into the hiring of people with the right skill sets, which should include understanding Linux.

Microsoft built Sonic on Linux, the open source operating system used mostly in servers. So, engineers have to know how to manage a Linux system and the containers inside it, Lahiri said.
