The New Face of High-Speed Interfaces

CXL is a CPU-to-device interconnect protocol that targets high-performance workloads. Here, you will find an introduction to the CXL specification. Explore the latest developments, use cases and more.

Download the presentation: The New Face of High-Speed Interfaces

00:00 Kurt Lender: OK. Hi, I'd like to welcome you to FMS 2020. I'm Kurt Lender, the co-chair of the CXL Marketing Work Group. I'm also a senior ecosystem manager at Intel Corporation. And I'm here with Siamak Tavallaei and we are going to introduce the CXL specification. And first, Siamak, can you introduce yourself?

00:26 Siamak Tavallaei: Thank you very much, Kurt. This is Siamak Tavallaei. I'm the co-chair of the CXL Technical Task Force, also a principal architect with Microsoft Azure.

00:39 KL: OK, thank you. So, like I said, we're going to introduce the 2.0 specification at a very high level. You'll see a lot more detail coming out from the marketing workgroup and other . . . We'll have a webinar actually in December, December 10th is the date now. We'll be posting a white paper on CXL 2.0 specification. So, again, this is a high level, and what I will do is again, talk about CXL in general, some of the market needs and why we did CXL. And then Siamak will go into some of the details on the specification itself.

So, the first thing is what drove the need for CXL, and that's this industry landscape that you're seeing. We see the proliferation of cloud computing today -- everything is being moved to the cloud. Companies are putting more and more of their applications out there. Data being analyzed right and left, growth of AI and analytics, so there's more need for data movement, data storage in that sense.

01:44 KL: And then cloudification of the networking edge, they're actually moving out to the cloud too. So, again, this proliferation of the cloud, it's just moving everywhere. With this, there are some growing demands. All these needs are driving demand for faster processing. So, with CXL, we've latched on to the PCIe specification. We're on PCIe 5 today and we'll certainly . . . We're monitoring the 6.0 evolution that PCIe is looking at now. So, we'll continue to drive faster processing both through the bus and PCIe, and then the latencies and things that CXL brings to the table. The other thing that you're seeing in this environment is the need for heterogeneous processing.

02:42 KL: There are different deployment models that are growing. You see obviously CPUs like from my company, Intel. You have FPGAs from the Intels and the Xilinxes of the world, you have custom ASICs being produced, and all these are being put into the same solutions. They're basically competitive solutions that need to work together. So, this need for heterogeneous computing, the computing really tuned to the application and optimized to the application, is growing.

The other thing that the landscape is seeing is really the need for increased memory capacity and bandwidth. And again, with CXL we have the .mem capability, and you'll see as Siamak talks about 2.0, the memory space is one of the areas where we definitely configure usage models into the CXL 2.0 specification.

And then the last is with the growth of memory you're seeing different memory tiers. So, again, it's one of the other added things for CXL 2.0 is some of the persistent memory hooks that we've put in.

04:01 KL: So, the response to the needs was CXL itself, and like I said it's been one year since we incorporated with one of the strongest boards, board of directors in the industry. We have the leading cloud providers, we have leading OEMs, we have all four CPU vendors. We really have the industry leaders and influencers in the CXL Consortium. And again, we're seeing the excitement grow more and more. We're up to 130 members today and that's really at adopter, contributor or promoter level.

We did open up the 1.1 specification, it is public. We're going to do the same for 2.0 also. The difference is adopters class get . . . As you implement, you get the IP protection rights from the CXL Consortium, so we do recommend that you come in at certainly adopter level or we hope contributor level, which again you can join. We are up to five different technical work groups. You can contribute to those workgroups as a contributor and influence the 3.0 and beyond specifications for CXL.

And the last, of course, this is open. I've already said it's going to be totally open, but again, this is open to the industry and being driven by the industry.

05:29 KL: The other thing I'll mention on CXL 2.0 is that we are going to . . . It will be backwards-compatible with 1.1, so there is that reuse and bringing forward of technology for that matter. You don't throw away previous revisions of your implementations. So, I've already mentioned the left slide here, the challenges: the industry need for faster processing for the next-gen data centers, the heterogeneous computing, the increased need for memory capacity and bandwidth. And really, we need an . . . At this time last year, we really didn't have a unified specification, open specification. But like I said, with the excitement around CXL, I do believe that the industry really is coalescing around CXL.

What CXL brings to the table, the features it brings, are the three listed on the right here. It's a coherent interface that really mixes and matches three protocols. There's .io, that's really a packet size version of PCIe. And then the two others are the .cache and .mem. And those are the two that bring the coherency to the memory between accelerators, or CXL-attached devices, I should say, and root complexes.

06:54 KL: And all this is done . . . The .cache and .mem are done with low latency basically designed in. And low latency, I mean like cache coherent-type levels of latency, so very low latency. Asymmetric complexity is the last feature, major feature of 1.1, and that's where the burden really is put on the root complex, and the endpoints can migrate from generation to generation. They can actually migrate from different CPU vendors, for that matter. And again, there's reuse of . . . Basically designs moving from one gen to another.

And then last, these are the usage models. We basically call them Type 1, 2 and 3. And this is where you can see that there is a mix-and-match nature of the protocols. The .io protocol is always there, and that's for enumerating the system and basically setting it up, but then you get into the different types.

08:18 KL: Type 1 is where you have some CXL-attached device, like a NIC or something like that, and it can share the processor memory. The type 2 is where you have the .memory protocol, and that's where that CXL-attached device has memory. The CPU can share it also. So, again, there's a wider range and growth of the memory space here. And then the last one is where the CPU actually can just add memory to its system, and that's with the .mem protocols. And again, this is one of the areas where we've expanded significantly for CXL 2.0. So, I'll now hand it over to Siamak to talk about some of the features, key features, that we've introduced with CXL 2.0.
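The mix-and-match of protocols across the three device types can be written down as a small lookup table. This is a minimal sketch, not anything from the specification text: the names `Protocol`, `DEVICE_TYPES` and `protocols_for` are hypothetical, and the table reflects the usual reading of the types (Type 1 uses .io and .cache, Type 2 uses all three, Type 3 uses .io and .mem).

```python
from enum import Flag, auto


class Protocol(Flag):
    """The three CXL protocols named in the talk (illustrative model only)."""
    IO = auto()     # CXL.io: enumeration and setup, always present
    CACHE = auto()  # CXL.cache: device coherently caches host memory
    MEM = auto()    # CXL.mem: host accesses device-attached memory


# Hypothetical lookup table: device type number -> protocols it uses.
DEVICE_TYPES = {
    1: Protocol.IO | Protocol.CACHE,                 # e.g. a NIC sharing processor memory
    2: Protocol.IO | Protocol.CACHE | Protocol.MEM,  # accelerator with its own memory
    3: Protocol.IO | Protocol.MEM,                   # memory expansion for the CPU
}


def protocols_for(device_type):
    """Return the protocol names used by a given device type, in a fixed order."""
    combo = DEVICE_TYPES[device_type]
    ordered = (Protocol.IO, Protocol.CACHE, Protocol.MEM)
    return [p.name for p in ordered if p in combo]


if __name__ == "__main__":
    for device_type in sorted(DEVICE_TYPES):
        print(f"Type {device_type}: {protocols_for(device_type)}")
```

A `Flag` enum is used so that each type is simply the union of the protocols it speaks, which makes the "always .io" rule easy to check.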

09:06 ST: Thank you very much, Kurt. As Kurt suggested, CXL started with point-to-point devices: the processor connecting to end devices and providing a low-latency, high-bandwidth interconnect for devices that could benefit from load-store semantics. That is on top of .io, which is very similar to PCIe for moving block-mode operations, DMA style. The .mem and .cache provide a low-latency interconnect for caching devices or smart devices. So, as Kurt suggested, the physical layer running on PCIe Gen 5, providing 32 gigatransfers per second per lane, was good enough for these end-to-end devices, point-to-point devices. But then people also asked for fanning out, so that one root port can address multiple end devices. CXL 2.0 addressed that by introducing one layer of switching, so devices underneath each switch could still be caching devices or could be memory-type devices. One layer of switching still provides a very large fan-out. Each host may have multiple CXL links, each switch can have multiple subordinate links, and therefore very many devices can be connected to one host.
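To put rough numbers on the link rate and fan-out described above, here is a back-of-the-envelope sketch. It assumes the PCIe 5.0 figures of 32 GT/s per lane and 128b/130b line encoding, and it ignores all protocol and flit overhead, so the results are upper bounds. The function names and the example port counts are hypothetical.

```python
GT_PER_S_PER_LANE = 32            # PCIe 5.0 signalling rate per lane
ENCODING_EFFICIENCY = 128 / 130   # 128b/130b line encoding


def link_bandwidth_gbytes(lanes, encoded=True):
    """Approximate one-direction bandwidth of a link in GB/s.

    Ignores packet and flit overhead, so real throughput is lower.
    """
    gbits = GT_PER_S_PER_LANE * lanes
    if encoded:
        gbits *= ENCODING_EFFICIENCY
    return gbits / 8


def max_devices(host_links, downstream_ports_per_switch):
    """Devices reachable by one host with a single layer of switching.

    Assumes every host link goes to its own switch (an illustrative topology).
    """
    return host_links * downstream_ports_per_switch


if __name__ == "__main__":
    print(f"x16 raw:     {link_bandwidth_gbytes(16, encoded=False):.1f} GB/s")
    print(f"x16 encoded: {link_bandwidth_gbytes(16):.1f} GB/s")
    print(f"4 links x 8 ports: {max_devices(4, 8)} devices")
```

Even with only one switch layer, the device count grows as the product of host links and switch ports, which is the "very large fan-out" the talk refers to.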

11:03 ST: A second major feature that could be enabled using CXL 2.0 is the fact that CXL 2.0 switches can be multihost-capable by specification. The diagram to the left is showing a CXL switch connected to multiple CPUs, multiple hosts, and multiple devices down below. In the example, host-1 connects to device-2 and device-3 and forms one major hierarchy, whereas, for example, host-3 is connected to device-4. That capability is enabled using the CXL Fabric Manager, which runs in conjunction with the CXL 2.0 switch. The CXL Fabric Manager is in charge of assigning end devices to hosts. An enhanced version of this, or a more capable version of that, is when the end device itself is capable of subdividing itself into multiple logical devices.

12:20 ST: So, an MLD, a multi-logical device, can be programmed to be bound to multiple hosts. Up to 16 hosts are supported within the CXL 2.0 switches. The switch itself can be connected to more than 16 hosts, but each device can be connected to up to 16 hosts. In this example, hierarchy one comprises host-1, device-1, a portion of device-2, and a portion of device-4, whereas host-3, in a different hierarchy, is using a portion of device-2, a portion of device-3, a portion of device-4, and a portion of device-N. As we described, CXL provides a low-latency, high-bandwidth transport to devices such as DRAM and to devices that are of higher latency, such as storage devices. The piece that we also added with CXL 2.0 was specific support for persistent memory: the type of memory that could look like a storage element but would like to live with load-store semantics at much smaller latency, for example in tens or hundreds of nanoseconds instead of microseconds.
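The host-to-device binding described above (before the persistent memory remarks) can be modelled as a toy data structure. This is only a sketch under stated assumptions: `MultiLogicalDevice` and `FabricManager` are hypothetical classes, not the Fabric Manager API defined by the specification, and the only rule taken from the talk is that one device can be shared by at most 16 hosts.

```python
MAX_LOGICAL_DEVICES = 16  # per-device limit mentioned in the talk


class MultiLogicalDevice:
    """A device split into logical devices, each bound to at most one host."""

    def __init__(self, name, logical_devices):
        if not 1 <= logical_devices <= MAX_LOGICAL_DEVICES:
            raise ValueError("a device exposes between 1 and 16 logical devices")
        self.name = name
        # Index is the logical device id; value is the owning host or None.
        self.bindings = [None] * logical_devices

    def bind(self, host):
        """Bind the first free logical device to a host and return its id."""
        for ld_id, owner in enumerate(self.bindings):
            if owner is None:
                self.bindings[ld_id] = host
                return ld_id
        raise RuntimeError(f"{self.name} has no free logical device")


class FabricManager:
    """Toy stand-in for the entity that assigns devices to hosts."""

    def __init__(self):
        self.devices = {}

    def add_device(self, device):
        self.devices[device.name] = device

    def bind(self, host, device_name):
        return self.devices[device_name].bind(host)

    def hierarchy(self, host):
        """List (device name, logical device id) pairs owned by a host."""
        return sorted(
            (name, ld_id)
            for name, dev in self.devices.items()
            for ld_id, owner in enumerate(dev.bindings)
            if owner == host
        )


if __name__ == "__main__":
    fm = FabricManager()
    for name in ("device-1", "device-2", "device-4"):
        fm.add_device(MultiLogicalDevice(name, logical_devices=4))
    for name in ("device-1", "device-2", "device-4"):
        fm.bind("host-1", name)
    fm.bind("host-3", "device-2")
    print(fm.hierarchy("host-1"))
    print(fm.hierarchy("host-3"))
```

The example bindings loosely mirror the slide: host-1 and host-3 each receive a different portion of device-2, so the two hierarchies share the physical device without sharing a logical one.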

14:12 ST: To support that, the concept of global persistent flush was also introduced so that individual root ports or individual hosts could command cycles to be flushed all the way to persistent store. Another major feature of CXL 2.0 was the encryption and security implementation on individual links, modelled after IDE as part of the PCIe implementation but enhanced for CXL.mem and CXL.cache. The encryption capability is link-based -- when the cycles leave the root port they could be encrypted, and when they enter the end device, they get decrypted. Links could include, in this model, routes, or links could include switches, so the encryption is maintained through the CXL switch as well. Kurt, would you like to bring us back to the summary slide please?

15:38 KL: Thank you, Siamak. So, again, a quick summary of 2.0. And I want to highlight the CXL Consortium, the momentum is growing. We've been at this now for a year and we're now at 130 members and growing, so come join us. We're on the second generation of the specification, and I know that the technical task forces or the work groups are starting to work on the next-generation specs.

So, again, I mentioned contributors can get that opportunity. You're not too late, you can join us now and help contribute to that. And, again, CXL really is . . . we're looking into responding to industry need, so that's another spot where I know there are a lot of discussions going on in the sense of, "Where should CXL continue to go?"

16:36 KL: The middle column here, CXL added things like switching, we needed more . . . We wanted to grow the system size, so the expansion phenomena of switching, plus the pooling where you could create these pools of resources, memory being one of them, that you could tap into. Persistent memory support, that again, gives you that tiering inside memory and is something that was requested by the industry. Certainly, in Flash Memory Summit there are a lot of folks doing those types of devices in this show. And then security was certainly a key feature that will continue to augment all this. We are backwards-compatible with the current CXL 1.1 and 1.0, that will continue, that is one of the base legs of CXL. We'll continue that moving forward.

17:33 KL: And then the last one that we really didn't talk about here is, there is a chapter in the 2.0 specification on compliance and interoperability. That workgroup is going to roll out their program fairly shortly, in either the latter part of Q4 or Q1, so we don't have that announcement in here, but that will be one of the things to look for moving forward. And then the call to action really is, as I've said, join the CXL Consortium. Adopter, the lowest level, gives you the IP rights, but again, contributors and promoters certainly can join the workgroups and contribute in that sense. This is a public specification, so if you're interested in looking at it first, go out on the website and grab it in that sense.

18:25 KL: There will also be . . . There is a white paper actually posted by this time, it's one of the funny things about being virtual is we're doing this in advance, but all this will be posted by the time that we actually do this presentation. And then I've also mentioned that we're starting our webinar series on 2.0 on December 10th. That will be a high-level overview, a little more depth than this overview, but we'll probably do . . . In Q1, we'll go into even more detail on some of these features for that matter.

So, continue to follow us and look for that. Follow us on Twitter and LinkedIn, we're also . . . We're doing regular blogs. Again, there are more than enough ways, go through the website to monitor the movement of CXL and the excitement coming in the future. So, with that, I will say thank you and we will have a quick Q&A timeframe to answer any questions. So, thanks and have a good FMS.
