Tech Accelerator Flash Memory Summit 2020
Guest Post

Using Hardware Acceleration to Increase NVMe Storage Performance

Learn from the VP of marketing for Marvell's flash business unit how the company sees the industry using hardware acceleration to bolster NVMe application performance.

Download the presentation: Using hardware acceleration for NVMe application performance

00:13 Thad Omura: Hello there. My name is Thad Omura, and I'm the VP of marketing for Marvell's Flash Business unit. Unfortunately, with the COVID pandemic, we can't all be together at the Santa Clara Convention Center for Flash Memory Summit this year, but here at Marvell, which is less than a mile away, we've been able to re-create the stage setting to give you a familiar keynote experience. So, wherever you are in the world, I do hope you and your loved ones are safe, and I'm thrilled to have the privilege to spend the next 25 minutes with you.

This is my 13th year in a row of being a part of Flash Memory Summit. Over that time, I've led flash silicon and solutions marketing for companies such as SandForce, LSI, Seagate and ScaleFlux. The last one was focused on the emerging field of computational storage; I'm now proud to lead the marketing efforts for Marvell's Flash Business unit.

I'd like to share with you today how we see the industry is taking advantage of hardware acceleration to increase NVMe application performance. So, for those of you who are not up to speed with Marvell, we are focused on semiconductor solutions that move, store, process and secure the world's data faster and more reliably than anybody else.

01:48 TO: Earlier this year, we celebrated our 25th birthday. Over that timeframe, we have evolved our product offering, and now we have a very strong focus on data infrastructure. Our mission is supported with the industry's most complete semiconductor portfolio focused on data infrastructure across processors, networking, security, and storage, which is what I'm going to focus on today. We have a true leadership position. We target the highest growth infrastructure markets, including the data center, which extends out into the cloud; the carrier market, which involves all of the rapid deployment of 5G technology; and the automotive market, which is quickly integrating more and more intelligence and storage.

Now, while storage has always been a part of Marvell's DNA, it's important to note that our leadership in these other product segments is what really enables us to approach storage solutions from a total infrastructure perspective. We all see how the lines are blurring between compute, networking and storage. As a result, we believe it's critical that products are defined with a holistic infrastructural-level approach. This means we simply do not strive to optimize performance in one area, only to expose limitations in another.

03:33 TO: Our goal is to create products that optimize the entire infrastructure to deliver the best application-level performance. As the premier supplier of flash storage silicon, we provide state-of-the-art NVMe solutions to the industry. Now, because this market is getting so diverse, we provide our customers with both merchant and custom ASICs, and then we match that with very deep firmware capabilities and expertise. We work in partnership with all the NAND vendors for media support and, in fact, many of those NAND vendors today use our controller technology for their own flash products. We have the support infrastructure to provide a very highly customized and differentiated solution.

In terms of products, we are the market leader in SSD controllers. Because of our scale, we are able to address both data center and client applications. We also provide very critical compute acceleration solutions in the data center, and we also convert networking to storage protocols in disaggregated storage applications. I'd like to share with you how these hardware accelerators and converters have become critical infrastructure components as more and more NVMe flash is deployed in the data center.

05:26 TO: Look, it's no secret that NVMe flash has, over the last number of years, been the go-to interface. Analysts such as Gartner have reported that the majority of servers today use NVMe over legacy SAS and SATA. Now, at the same time, the ramp for NVMe in storage is early, but it's growing. This transition to NVMe is happening because of a reduced software stack, and with that reduction in the overhead on the CPU, we are able to enable applications to meet their low-latency service-level requirements. There are petabyte-scale workloads that need to be ingested quickly, and we can now do that orders of magnitude faster than with legacy interfaces.

Now that the industry is well on its way to consolidating on NVMe flash storage, it is important to take a step back and really look at the infrastructure that is required to optimize the usage of all of this NVMe flash storage that has been or will be deployed. What we see on the compute server side is that the main driver of growth is really the need to significantly improve the efficiency of NVMe flash storage. And on the storage side, for as long as NVMe over Fabrics has been talked about, there are still infrastructure and scalability limitations and challenges that are blocking the full potential of disaggregated NVMe flash storage. Let's first dive into NVMe server efficiency.

07:33 TO: We all knew that as soon as flash storage performance started to be based on PCI Express, we were all in for a wild ride. We can easily see today that a 25x performance jump from SATA to PCIe Gen 5, which will double again when we move to Gen 6, is already upon us. We also see a massive capacity increase as 3D NAND continues to increase in die density and the NAND vendors race to continue to drive the cost of SSDs down.
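As a rough check on the 25x figure, the interface line rates can be compared directly. This is only a back-of-the-envelope sketch: the SATA III rate, the PCIe Gen 5 per-lane rate, the line encodings and the x4 SSD link width are assumptions brought in for illustration, not numbers quoted in the talk, and protocol overhead is ignored.

```python
# Back-of-the-envelope interface bandwidth in GB/s (1 GB = 1e9 bytes).
# Assumptions (not from the talk): SATA III at 6 Gb/s with 8b/10b encoding,
# PCIe Gen 5 at 32 GT/s per lane with 128b/130b encoding, and an x4 SSD link.

def link_gbytes_per_s(gigatransfers_per_s: float, lanes: int,
                      payload_bits: int, coded_bits: int) -> float:
    """Usable bandwidth after line-encoding overhead, ignoring protocol overhead."""
    return gigatransfers_per_s * lanes * (payload_bits / coded_bits) / 8

sata3 = link_gbytes_per_s(6.0, lanes=1, payload_bits=8, coded_bits=10)
pcie5_x4 = link_gbytes_per_s(32.0, lanes=4, payload_bits=128, coded_bits=130)
ratio = pcie5_x4 / sata3

print(f"SATA III:      {sata3:.2f} GB/s")
print(f"PCIe Gen 5 x4: {pcie5_x4:.2f} GB/s")
print(f"Ratio:         {ratio:.1f}x")
```

Under these assumptions the ratio lands in the mid-20s, consistent with the jump described above. Doubling the per-lane rate for the next generation would roughly double it again.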

Now, this is all goodness. But when you think back to the principles of Amdahl's law, all of this additional storage performance and capacity cannot be utilized efficiently if the other parts of the system don't keep up. These huge gains in flash storage performance have not been balanced with the same server CPU gains and, as a result, all of this NVMe flash that has been deployed is not being efficiently utilized in the server. This is exactly why hardware accelerators for NVMe data path are so important to rebalance the compute to storage ratio, so application performance using NVMe SSDs can be optimized. Marvell is focused on a flow-through NVMe accelerator architecture to offload the host CPU -- NVMe data goes in, NVMe comes out.
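The Amdahl's law argument can be made concrete with a few lines of arithmetic. The sketch below uses a hypothetical workload split (60% of runtime in storage I/O, 40% on the host CPU); that split is an assumption chosen for illustration, not a figure from the talk.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only `accelerated_fraction` of the runtime
    becomes `factor` times faster and the rest is unchanged."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)

# Hypothetical workload: 60% of runtime is storage I/O, 40% is host CPU work.
STORAGE_FRACTION = 0.6

for storage_speedup in (2, 25, 1000):
    overall = amdahl_speedup(STORAGE_FRACTION, storage_speedup)
    print(f"storage {storage_speedup:>4}x faster -> application {overall:.2f}x faster")

# The ceiling is set by the part that did not speed up: 1 / (1 - 0.6) = 2.5x.
ceiling = 1.0 / (1.0 - STORAGE_FRACTION)
print(f"upper bound with infinitely fast storage: {ceiling:.2f}x")
```

However fast the storage gets, the application in this example can never run more than 2.5x faster unless the CPU-bound portion also shrinks, which is the imbalance that offloading work from the host CPU is meant to address.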

09:15 TO: Now, accelerators must be easy to use, and that's why we support inbox driver compatibility. That means if you use your standard Linux or Windows or VMware NVMe driver, it works with our accelerators, just as if you were plugging the SSD directly into the CPU. We focused on an innovative DRAM-less, low-power design to make it easy to build very small form factor solutions around these accelerators. Our accelerators are customizable to fit into any cloud or system OEM management infrastructure. Based upon our customer feedback, both RAID data protection and storage I/O virtualization are functions that significantly slow down the host CPU and impact application-level performance the most, and that's where we have focused our energy.

10:20 TO: So, speaking of data protection, HPE recently announced a RAID 1 boot solution that uses our hardware acceleration technology, and they're using this to completely offload the host processor from all RAID processing. HPE offers it today as part of their ProLiant server platform, and they also offer it with other server families. Now, the first thing to note is that because of our low-power chip design, we fit right onto a PCIe card that has room to plug in two full-size M.2 SSDs. This boot SSD solution completely protects critical log files and recovery files from corruption, and the reason why we're able to do that is that we completely isolate them and make sure they're not part of the main user data RAID stripe in the system.

11:25 TO: This is absolutely critical in hyper-converged and virtualized infrastructure, where multiple tenants are simultaneously running applications on the same server hardware and one virtual machine crash can potentially corrupt or impact all of the other tenants.

Now, by using a separate hardware accelerator for RAID boot and recovery files, you can now maximize your storage slots in the server for all user data, and by leveraging inbox drivers and completely offloading the CPU, this solution provides tremendous value to free up the host CPU to support either more tenants or optimize application performance -- the exact thing we're all trying to get at.

We also see hardware accelerators play a key role in storage I/O virtualization, especially in cloud and hyperscale environments. Traditionally, when virtual machines want access to storage, they must communicate with the hypervisor that's running on the CPU to get its permission. But with more and more virtual machines wanting access, the hypervisor processing overhead becomes a bottleneck, which significantly slows down application performance. But in a hardware-accelerated architecture, the hypervisor function can actually be offloaded to a hardware accelerator device. This enables virtual machines to access storage as if they're directly connected to the SSD with all of the same infrastructure access and permissions that you would expect a hypervisor to control.

13:29 TO: Now, by completely offloading the host CPU from this function, true NVMe performance with low latency can be realized in multi-tenant environments. We are working with Tier-1 hyperscale vendors to deploy this technology later this year, and they have shared with us how important this accelerator technology is to meeting their total cost of ownership for their large-scale infrastructure.

We do see that this function of storage I/O virtualization is migrating into the SSD controllers, and we're actually doing that in our own SSD controller offering, but several customers like the accelerator model, and the reason why they like it is because it provides a consistent storage I/O virtualization solution, regardless of which SSD solution or vendor you use behind it.

OK, let's shift gears a little bit. I'd like to focus now on NVMe over Fabrics and disaggregated storage scalability. We all know that when you start grouping NVMe SSDs together in a disaggregated storage system, you are consolidating on the order of 100 gigabytes per second of storage performance. Of course, you'd like to open access to all of the storage, so the box is naturally connected to the Ethernet network. We see this disaggregated storage model is typically built with JBOF, or just a bunch of flash, storage boxes.

15:27 TO: The issue is that these JBOFs typically have a 100 gigabit per second NIC, which is really only on the order of tens of gigabytes per second of performance. In these JBOFs, the NIC, the CPU and the memory complex are responsible for receiving NVMe over Fabrics data coming in from the network and then translating it to NVMe on the drive. The problem is that this architecture is essentially choking access to all of that NVMe performance with the NIC and the CPU sub-system. There has to be a better way.
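The mismatch between drive-side and network-side bandwidth in a JBOF can be sketched numerically. The 100 GB/s aggregate SSD figure and the 100 Gb/s NIC come from the talk; the helper names and the dual-port variant are illustrative assumptions, and only raw line rate is considered.

```python
def gbits_to_gbytes(gbits_per_s: float) -> float:
    """Convert a line rate in Gb/s to GB/s (8 bits per byte, no protocol overhead)."""
    return gbits_per_s / 8

def oversubscription(aggregate_ssd_gbytes: float, nic_gbits: float, nic_ports: int = 1) -> float:
    """How many times the SSDs' combined bandwidth exceeds the NIC's line rate."""
    return aggregate_ssd_gbytes / (gbits_to_gbytes(nic_gbits) * nic_ports)

AGGREGATE_SSD_GBYTES = 100.0   # order-of-magnitude figure quoted in the talk
NIC_GBITS = 100.0              # per-port NIC speed quoted in the talk

single = oversubscription(AGGREGATE_SSD_GBYTES, NIC_GBITS, nic_ports=1)
dual = oversubscription(AGGREGATE_SSD_GBYTES, NIC_GBITS, nic_ports=2)

print(f"One 100 Gb/s port = {gbits_to_gbytes(NIC_GBITS):.1f} GB/s")
print(f"Single-port NIC: SSDs oversubscribe the network {single:.0f}x")
print(f"Dual-port NIC:   SSDs oversubscribe the network {dual:.0f}x")
```

Even before any CPU or memory overhead is counted, most of the flash bandwidth in such a box cannot reach the network at line rate.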

This scalability limitation is why we have driven to market a dedicated hardware device that converts NVMe over Fabrics to NVMe at the drive level. With this device, you can see how networking and storage are converging in a very big way, and only a company like Marvell -- with deep understanding of both technologies -- can make it happen. Essentially, what we've done here is we've eliminated the need for the CPU, the DRAM and the NIC in the JBOF, and we've distributed the NVMe over Fabrics to NVMe processing to dedicated hardware devices. As a fabric transport, we support RoCE v2; we also support TCP/IP, which will be coming next year.

17:10 TO: The device itself supports dual-25 gigabit per second Ethernet ports on the network side, and on the storage side, we support a single PCIe 4.0 interface to the SSD. Of course, we've made the device low power so it can fit in a very small footprint and in small designs.

We are delighted to work with Kioxia, an industry-leading SSD provider, to utilize this technology and enable the industry's first native NVMe-oF Ethernet SSD. This device features dual-25 gigabit per second Ethernet ports that can directly connect to the Ethernet fabric. So, the future is here, the future is now. Shout out to Alvaro Toledo and the Kioxia team for their strong partnership on NVMe over Fabrics and leadership in NVMe SSDs.

18:16 TO: Let's now look at how an NVMe over Fabrics JBOF compares to an EBOF, or what we call Ethernet Bunch of Flash. The JBOF bottlenecks all of that NVMe flash performance behind a NIC and a CPU. This architecture has single points of failure, and when you want to scale, you have to add another box and connect it all the way up to the main Ethernet switch fabric. But by putting Ethernet SSDs behind an Ethernet switch, you completely unlock all of that NVMe performance, and you simply scale performance with Ethernet when you connect into the box with whatever port speeds you'd like.

Our converter supports end-to-end multipath, which eliminates single points of failure throughout the entire switch fabric, and it's very simple to scale storage performance and capacity by simply daisy chaining additional EBOFs to each other. Because Marvell's an industry leader in both Ethernet switching technology and flash storage, we were able to see the value of consolidating network and storage to enable this very innovative architecture.

19:49 TO: And while last year we talked about EBOF solutions coming to market, this year we are super-excited to share that there are two leading providers of EBOF storage solutions: Ingrasys, which is part of Foxconn, and Accton. Both have released EBOF platforms to the market. These platforms feature dual Marvell Ethernet switches, each switch has six 100 gigabit per second links attached, and the switch fabrics are built within the EBOF to have multipath to 24 Ethernet SSDs directly, or you can connect NVMe SSDs through an interposer card.
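The port counts quoted for these platforms can be checked for balance between the drive side and the uplink side. This is a minimal sketch that assumes every one of the 24 Ethernet SSDs has both of its 25 Gb/s ports active (one toward each switch); that wiring detail is an assumption, not something the talk spells out.

```python
def total_gbits(devices: int, ports_per_device: int, gbits_per_port: int) -> int:
    """Aggregate line rate across all ports of a group of identical devices."""
    return devices * ports_per_device * gbits_per_port

# Drive side: 24 Ethernet SSDs, each with dual 25 Gb/s ports (assumed all active).
drive_side_gbits = total_gbits(devices=24, ports_per_device=2, gbits_per_port=25)

# Uplink side: two switches, each with six 100 Gb/s links.
uplink_side_gbits = total_gbits(devices=2, ports_per_device=6, gbits_per_port=100)

print(f"Drive-side bandwidth:  {drive_side_gbits} Gb/s ({drive_side_gbits / 8:.0f} GB/s)")
print(f"Uplink-side bandwidth: {uplink_side_gbits} Gb/s ({uplink_side_gbits / 8:.0f} GB/s)")
print(f"Uplink-to-drive ratio: {uplink_side_gbits / drive_side_gbits:.2f}")
```

Under that assumption the two sides match, so at line rate the switch fabric would not oversubscribe the drives' Ethernet ports.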

We utilize our highly scalable Prestera Ethernet switch family, which is integrated directly into the EBOF. These Ethernet switches can be selected to be right-sized for the desired number of ports and performance, as Marvell today has available switches that scale up to 12.8 terabits per second. Marvell has also invested in an EBOF software development kit based upon the widely popular open source SONiC switch management suite. We have built in a number of critical NVMe over Fabrics storage-related features, including drive recognition, congestion management and quality of service. We're happy to provide this SDK as part of the platform and enable OEMs to further customize their own EBOF solutions.

21:34 TO: At the end of the day, the EBOF represents a storage solution that has the highest performance density for NVMe over Fabrics disaggregated storage. To add to the excitement is the fact that end users like Los Alamos National Labs are seeing the true system-level benefits of the EBOF architecture. Now, they've released scalability benchmarks that show multiple hosts, when connected to a single EBOF solution with Optane SSDs, are really able to scale the performance much better than with a standard JBOF. With one or two 100 gigabit Ethernet ports connected to the EBOF, the throughput ramps, but then you see it is eventually limited by the Ethernet port throughput, just as you would expect.

Now, keep in mind, this is the max throughput that you would expect out of a JBOF with a dual-port, 100 gigabit per second NIC, but as you scale the EBOF to four ports and even six ports, you see that you can actually triple the throughput in the same box. And this performance is generated after you have actually removed the NIC, the CPU and the DRAM, so you're saving a tremendous amount of cost.
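The shape of that scaling curve follows from a simple port-limited model: throughput grows with the number of hosts until it hits the combined line rate of the connected ports. The sketch below is illustrative only. The per-host load of 10 GB/s is a hypothetical value, and the model ignores protocol overhead and drive limits; it is not the Los Alamos benchmark data.

```python
PORT_GBYTES = 100 / 8          # one 100 Gb/s Ethernet port at line rate, in GB/s
PER_HOST_GBYTES = 10.0         # hypothetical load each host can generate

def deliverable_gbytes(hosts: int, ports: int) -> float:
    """Throughput is the offered load, capped by the total port line rate."""
    offered = hosts * PER_HOST_GBYTES
    port_limit = ports * PORT_GBYTES
    return min(offered, port_limit)

for ports in (2, 4, 6):
    curve = [deliverable_gbytes(hosts, ports) for hosts in range(1, 9)]
    print(f"{ports} ports: {curve}")

# With enough hosts to saturate the links, six ports deliver three times
# what two ports do.
gain = deliverable_gbytes(8, 6) / deliverable_gbytes(8, 2)
print(f"6-port vs 2-port saturated throughput: {gain:.0f}x")
```

The two-port curve flattens at 25 GB/s, which is the ceiling a dual-port 100 Gb/s NIC would impose, while the six-port curve keeps climbing to 75 GB/s.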

23:10 TO: I'd like to take a moment to thank Dominic Manno and the Los Alamos team for working directly with us on validating the EBOF value proposition and letting us share their data. We actually look forward to applying this technology to more HPC applications that we know will benefit from the scalability provided by this EBOF architecture. I'd also like to share with you some very interesting results from our friends at Micron.

Now, one of the most storage I/O-intensive applications that exist in the world today is in the field of AI and machine learning, and the reason is because GPUs can ingest so much I/O from storage at a very high rate that it's really important to keep them fed if you want to reduce the runtime of these petabyte-scale training workloads. But the main bottleneck in this particular training workload is typically getting access to high-throughput storage I/O. So, what Micron did is they utilized Nvidia's latest DGX A100 state-of-the-art GPU platform.

24:36 TO: And they connected six 100 gigabit per second Ethernet ports to the EBOF storage system that's loaded with 24 Micron NVMe SSDs through the Marvell converter. Now, traditionally, all of the GPU I/O that happens to go out to storage must first go through the host CPU that sits in the DGX box before the NVMe traffic can flow out onto the Ethernet ports. In order to get around this bottleneck, Nvidia has introduced an optimized driver solution that they call GPUDirect Storage, where the GPU actually bypasses the host CPU and the data flows directly from the GPU out onto the NVMe over Fabric storage -- this improves performance and reduces latency.

So, when comparing the usage of these two driver models, we see that EBOF nicely continues to scale the performance as the number of worker threads starts to ramp up. In fact, we're able to see a 4x improvement in storage throughput with a 72% reduction in latency, and while these gains are a very big testament to the updated GPU to NVMe over Fabric storage driver from Nvidia, it does demonstrate the fast ramp and demand for highly scalable NVMe over Fabric storage. And, we're looking forward to seeing additional application-level benchmarks enabled by EBOF in this very exciting AI and machine learning space where there is a huge demand for storage throughput.

26:37 TO: I'd like to take a moment to thank Currie Munce and the benchmarking team at Micron for their support and partnership and sharing this GPU-related benchmark data with us. Micron will share more details about this benchmark and others using EBOF as a storage target in their Wednesday session on NVMe over Fabrics.

As you can see, we've made a tremendous amount of progress with Ethernet SSDs in the last year. We are seeing that the first wave of deployments will be driven by end users who simply cannot scale their NVMe over Fabric storage performance using traditional JBOF architectures. In the field of HPC and AI and machine learning, as well as other applications that we see emerging, like database analytics and scalable cloud applications, we are seeing increasing demand for improved storage scalability. These applications prioritize the need for full throughput to disaggregated flash storage. Ethernet SSDs and EBOF provide a very simple solution to this need.

28:05 TO: We see the industry is now focusing on the need to further develop storage management solutions to ease the deployment of this technology in additional data center environments. For many applications, EBOF takes the cloud approach of using software-defined storage solutions to scale out orchestration, reliability and redundancy. The SNIA Networking Storage group is actively working on standards for this network storage management and welcomes your participation.

Now, we also see that, like our accelerator products, as native NVMe over Fabrics Ethernet SSDs go mainstream, the converter technology will eventually be integrated directly into the SSD controller itself at the drive level. We see dual-interface SSD controllers that can handle both NVMe and NVMe over Fabrics. What's really critical to make this all happen is the ability to gain access to more aggressive silicon process nodes, because we need to make sure that the price points and the power can fit into SSD form factors. This is where Marvell is in an ideal position to embrace this market trend. Now, we would like to partner with the industry and drive further proliferation of NVMe over Fabrics Ethernet SSDs and EBOF solutions to market. So, please do reach out if you are interested in getting involved in this very exciting technology.

30:08 TO: OK, I'd like to now share with you how we see the market is evolving for NVMe from an SSD controller perspective. Now, as the market leader, we see one of the primary markets that has shifted fully to NVMe is the client laptop notebook space. We see that the performance needs for these platforms have plateaued, and the real focus now is on battery life and low power of these laptops. Marvell has responded with the industry's first 12 nanometer PCIe Gen 4 DRAM-less SSD controller that is optimized for this market. We continue to deploy large volumes of SSDs into the data center market and see demand for more features to support improved efficiency and scalability of NVMe. Because we enable this infrastructure with the accelerators and converters that I've talked to you about today, we are in the best position to integrate these value-add features and functions into the SSD controller itself as these functions go mainstream.

31:27 TO: In the data center market, we are seeing that the volume of our first DIY cloud vendor is starting to ramp, and that ramp will continue through next year. Now, what is DIY? Marvell has put together a highly valuable do-it-yourself model where customers can basically customize how they build and consume SSDs for their specific application and environment, from the controller to firmware, all the way to getting support from many different NAND vendors. This is a key, key feature because it helps in sourcing flexibility. We support customization along the entire process. It takes a tremendous amount of support and resources to make this all happen, and that's why we're so excited to see that this model delivers true value to the industry.

But we have also seen this year the emergence of NVMe's expansion into two new growth markets. The first segment is what we generically call the edge, where there is a tremendous amount of application-level customization that's required. We are delivering today to our first DIY edge customer a highly customized controller and firmware solution in volume, and this application has a whole new set of performance and power requirements. And we are also seeing tremendous excitement for NVMe in the auto market where autonomous driving, sophisticated in-vehicle entertainment systems and storage consolidation are driving the need for SSDs. This market targets a whole set of requirements specifically around reliability. We'll have more to say about this in future updates.

33:34 TO: As more and more markets and opportunities are created for NVMe, having the appropriate scale, IP and access to advanced silicon nodes is going to be the key differentiator. And this is where Marvell is extremely well positioned.

It has been an absolute pleasure to share with you the progress that Marvell has made over the last year. We have seen that NVMe accelerators are being deployed into infrastructure to offload the host CPU and enable optimized application-level performance. We see converters are the key to unlocking NVMe over Fabrics disaggregated storage scalability, and immediately see demand in the field of HPC, AI and machine learning and many other applications. We are excited to see the emergence of new high-growth markets this year for NVMe SSDs and look forward to leading in those segments as well, and I'd like to leave you with the message that for any of your NVMe SSD controller or infrastructure silicon needs, please reach out to Marvell. We are very open to partnering with the industry, as you probably have heard me talk about throughout this entire presentation. Our worldwide corporate leadership, access to advanced silicon process nodes and established operations and support infrastructure are what set Marvell apart in this high-growth flash storage silicon market.

35:41 TO: Please stop by our booth, our virtual booth, it's in the FMS Expo area by clicking on the Marvell logo located on the show floor wall. We have exciting products and technology demos and updates you certainly are not going to want to miss out on. I also want to let you know this presentation will be followed up by a Q&A session, so please do stand by and hang on the line for that. Thank you.
