Tech Accelerator Flash Memory Summit 2020

Storage as the Driver of Change: Rethinking Data Infrastructure

This keynote presentation from Western Digital's President Siva Sivaram explores the advancements in semiconductors, flash technologies and data architectures.

00:14 Dr. Siva Sivaram: Hello all. It is wonderful to be back at the Flash Memory Summit talking to you all about the advancements in semiconductors, in flash technologies and in data architectures.

People have been talking a lot about the profusion of data, the amount of data that is flowing in from the edges, from the endpoints up into the cloud. The big battleground that is in front of us is how do we get the maximum value out of this data? We at Western Digital are particular that we give you an ability, an infrastructure to store and access the data in a secure fashion. We're going to be talking to several key Western Digital executives on what they are doing with respect to this ability to take the data, curate the data, and be able to deliver it in a fashion that you can get the maximum value out of this data.

01:15 DS: Conventional thinking is that compute dominates the data architecture. We've been seeing this over time as Moore's Law slows down in both compute and in DRAM. Purpose-built architectures are showing up everywhere. Purpose-built for different workloads, whether it is for search, whether it is for video, whether it is for database acceleration. Many, many such workloads require additional specific semiconductor solutions in the compute space. That's because overall, you can increase the number of cores, but you are not able to shrink the transistor much further rapidly.

02:02 DS: Similarly, in DRAM, the density of DRAM is not growing. But there is one place where Moore's Law is alive, well and kicking, where we have not hit the Moore's wall, and that's in flash. In flash, that's happening because we have these two new additional ways of scaling. We are not just scaling in one direction. We are not just scaling in the Y direction, we are also scaling in the Z direction. X, Y, and Z scaling allows for rapid scaling of flash, as we saw with the 3D NAND revolution that happened in the earlier part of this decade.

What goes in association with that is logical scaling. The ability to get from two to three to four, now we are talking about 5 bits per cell from that same die. That allows for this rapid scaling of flash. We have to keep in mind that this amount of data is useful if it is stored. Stored literally means at the end of it a bit has to flip somewhere. We at Western Digital feel morally responsible for being able to provide you with that ability to store and secure and access that data.
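As a rough illustration of the logical scaling described here (a sketch, not something shown in the talk): storing n bits in one cell means the cell must hold one of 2^n distinguishable charge states, so each extra bit doubles the number of states while adding a progressively smaller share of capacity. The function names below are illustrative assumptions; the MLC, TLC, QLC and PLC labels are the ones used later in this conversation.

```python
def states_per_cell(bits_per_cell):
    """Distinct charge states a cell must resolve to store this many bits."""
    return 2 ** bits_per_cell


def capacity_gain(from_bits, to_bits):
    """Relative capacity of the same cell array when moving between bit densities."""
    return to_bits / from_bits


# Labels as used in the talk: 2, 3, 4 and 5 bits per cell.
for name, bits in [("MLC", 2), ("TLC", 3), ("QLC", 4), ("PLC", 5)]:
    print(f"{name}: {bits} bits/cell -> {states_per_cell(bits)} states")

# Going from 4 to 5 bits per cell adds 25% capacity but doubles the states.
print(capacity_gain(4, 5), states_per_cell(5) / states_per_cell(4))
```

The diminishing gain per added bit is one reason logical scaling is paired with physical scaling rather than replacing it.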

03:30 DS: So when we have X, Y, and Z scaling and Moore's Law continues to work for 3D NAND, and we are able to provide you with these 256-gigabit, 512-gigabit, 1-terabit, 2-terabit dies, and we are able to provide systems, solid-state drives, small die for edge applications, large die for cloud applications, these are happening, what happens next?

What happens when these are put into systems? What is needed to make this data be accessible, for instance? How do we marry an operating system to the characteristics of the media, what we call Zoned Namespaces, ZNS? What happens when we can build controllers with open source architectures such as RISC-V? What are these controllers going to be able to provide in terms of ECC, or in terms of the front- and back-end host controllers? What can we do with security?

We know that security threats against data are proliferating. What do we need to have a root of trust established around the flash? What does it take to network the storage? What does it take to put it on NVMe over Fabrics?

04:54 DS: What does it take to make sure that there are accelerators, AI accelerators, that can seamlessly sit on that same fabric? So that, instead of moving all the data back and forth through the processor, we can now do it in the appliance itself, where some command from the host makes sure that the data and its processing happen in the storage appliance? What happens with video? Smart edge video sensors to make sure that we do local inferences. How do we make sure that edge intelligence makes this happen?

All of these and many more such as these are important developments in and around the core strength of storage. Western Digital takes its core storage strength and then applies it to all of these additional applications that go around the data. We are going to be talking to several experts today on each of these applications. I'm looking forward to these conversations.


06:02 DS: Let me introduce to you Dr. Yan Li. She is the vice president of silicon technology and manufacturing. She has over 200 patents and is a distinguished member of the IEEE. Yan, you and I have worked together now for the better part of a decade. 3D NAND and its scaling have been pushing Moore's Law further and further along. What are the new things that are happening in flash that are keeping it scaling so very well?

06:38 Dr. Yan Li: So, 3D NAND is indeed scaling very well. We have two aspects of scaling: physical scaling as well as logical scaling. Let me talk about the physical scaling first.

On the physical scaling, we are in a very good position to avoid the lithography limitation. So, we build the 3D NAND so that it can continue to scale according to Moore's Law. And we're not only growing the layers from 48 layers to above 100 layers in production now; we are going to go to 200 layers pretty soon. We are also going to shrink the X-Y dimension to pack more memory cells into each layer, so this actually multiplies the scaling effect. Not only that, we're also hiding the peripheral circuits underneath and above the array to get the full benefit of the scaling.

07:29 DL: On the logical scaling side, we have the TLC, 3D TLC, 3 bits per cell, which is already replacing the 2D NAND MLC completely in lots of products. This is because the 3D NAND is a very good cell, a very big cell. And then now we are going for QLC, the 4 bits per cell. The 4 bits per cell will go to mainstream SSD pretty soon because of the four-plane architecture we are starting with the sixth generation. So, people already work on the 5 bits per cell, which we call PLC. And we have, on the system side -- architecture-wise -- a Zoned Namespace, so called, which is also helping bring QLC into production in lots of SSD products. With all that, our scaling basically propels all these larger capacities in a small footprint; it is a very good scaling path.

08:26 DS: Now, we think of storage as a static thing. You think of it as a fixed, "Hey, this is what a block storage device is going to be." But you now are making it very dynamic. What's happening with this level of active, dynamic scaling and active dynamic characteristics of 3D NAND?

08:49 DL: So, the key is that we have wide applications, from edge devices all the way to data centers. For example, the edge device needs low power and high performance, but the data center needs high capacity and high endurance. So, we have very wide applications; we design one NAND chip, but we design so many different variations to trade off power, energy and endurance. We are also scaling the I/O speed. Our I/O speed goes from 400 megabytes to 2 gigabytes, five-fold in four generations. So, all these innovations actually drive the innovation in the NAND. NAND is never static -- we have lots of things going on every year.
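To put the I/O figure in perspective, here is a small worked calculation (a sketch, not from the talk). It assumes decimal units, so 2 gigabytes is 2,000 megabytes, and it assumes the growth was spread evenly across the four generations, which the talk does not state. The function name is illustrative.

```python
def per_generation_factor(start, end, generations):
    """Average multiplicative speedup per generation, assuming even growth."""
    return (end / start) ** (1 / generations)


start_mb = 400    # figure quoted in the talk
end_mb = 2000     # "2 gigabytes", assuming decimal units
overall = end_mb / start_mb
step = per_generation_factor(start_mb, end_mb, 4)

print(f"overall speedup: {overall:.1f}x")
print(f"average per generation: {step:.2f}x")
```

Under those assumptions, a five-fold gain over four generations works out to a little under 1.5x per generation.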

09:43 DS: I also learn from you more and more that we're adding a lot more intelligence to the NAND device. What do you mean by that? I mean, I thought it was just read and write and erase. That's all we used to do with NAND device. What's this intelligence doing?

10:01 DL: So, you know, big data grows by the second, dramatically; every second we generate so much data. The data analysis suffers from the conventional von Neumann computer architecture nowadays. The conventional architecture basically brings all the data from storage and moves it to the central location of the CPU or GPU to get processed.

But this moving, it takes lots of energy, and the latency takes time. So, this is something we have to break for the big data problem. Big data requires a new solution. So, we want to move the compute as close to the storage as possible, even inside the NAND. So, as we are scaling the NAND, we also improve the CMOS. There is a potential in the future that we actually build the intelligence inside the NAND. So, NAND not only can store the data, it can also compute, encrypt for security, do ECC for data integrity, even add AI functions. So, this could be a game-changer in the future, completely changing the architecture and the functionality of the NAND.

11:11 DS: So, you're telling me, this simple block device, the flash memory, is scaling on X, Y, Z, and logic. It is very dynamic in that you're designing for the edge. You are designing for the core with characteristics of high endurance or low power, while at the same time figuring out how to add more logic, more intelligence into the CMOS circuitry under or above.

11:37 DL: Absolutely, we can do that in the future.

11:40 DS: Thank you.

11:41 DL: Thank you.


11:43 DS: It's my pleasure to introduce to you Dr. Richard New. He is the vice president of research at Western Digital. He was one of the inventors of the shingled magnetic recording architecture, and now on to ZNS, the Zoned Namespace architecture.

Richard, we've been talking about computational storage, the idea of bringing compute ever closer to storage. What's new in the field of computational storage? What is exciting about it? Why should we care about it?

12:17 Dr. Richard New: Thanks, Siva. So, that's a great question. So, I think it's probably fair to say that one of the significant trends in compute during the last decade has been this idea of moving compute from a central processing unit out to some other device that can do the compute more efficiently. So, for example, think back to graphics accelerator cards, later GPGPU, accelerating machine learning functions. You know, other types of inference and machine learning accelerators, so these are all DPUs. These are all ideas of, you know, you have a specific kind of compute that, instead of doing it on a CPU, you can do more effectively in a custom architecture.

So, computational storage extends that idea. And the idea there is to take compute function that would be in the CPU and move it down to a storage device. And if you think about it a little bit, you can figure out what kinds of applications or compute functions might make sense in that context.

13:14 DN: So, if you think about, for example, a box of SSDs connected to other servers through a network, you can imagine that the aggregate bandwidth of all of the NAND die and all of the SSDs is much greater than the network connection. And so you could propose, for example, that there might be some compute functions that require very high throughput to the storage device, and are maybe not that complex as compute functions, that you could execute on that platform very effectively.

So, a classic example would be a pattern match function. So, if you want to scan through all the data just looking for a particular pattern, that's something that is computationally very easy but requires very high throughput to the storage device. So, that would be sort of a candidate application. Maybe for databases, you would think of row filtering or column filtering. So, these are all ideas that potentially have application in the computational storage space.

14:13 DN: The other aspect of these types of computation is that it's really good if, for a pattern match function, the amount of data that you need to return back to the server is much less than the amount of data that you would need to scan. So, if you're trying to find a particular pattern in a large pool of data, you can scan through and you only have to return the match locations, for example.
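The pattern match idea can be sketched in a few lines (an illustration only, not Western Digital's implementation). The scan touches every byte, but only the match offsets would need to travel back to the host. The function names and the 8-byte offset size are assumptions made for the example, and the data is synthetic.

```python
def find_matches(data, pattern):
    """Scan all of data and return the offset of every occurrence of pattern."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    offsets = []
    start = data.find(pattern)
    while start != -1:
        offsets.append(start)
        start = data.find(pattern, start + 1)
    return offsets


def bytes_returned(offsets, offset_size=8):
    """Bytes sent back to the host if each match is reported as one offset."""
    return len(offsets) * offset_size


# Synthetic data standing in for what the drive would scan locally.
data = b"abc" * 1000 + b"NEEDLE" + b"xyz" * 1000 + b"NEEDLE"
matches = find_matches(data, b"NEEDLE")

print("bytes scanned:", len(data))
print("match offsets:", matches)
print("bytes returned:", bytes_returned(matches))
```

The ratio between bytes scanned and bytes returned is what makes a function like this a candidate for running next to the storage rather than across the network.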

So, this gives you some idea of the types of compute functions, I'll call them the basic compute elements or primitives, that you could use for computational storage. And then the problem is to find a larger, more general-purpose application, let's say a database; you want to find an application where a significant amount of the compute, let's say in a general-purpose query, is really executing these primitives.

14:58 DS: This appears to be as big a software problem as it is a hardware accelerator. What is Western Digital doing about the software ecosystem that needs to go with this?

15:11 DN: We have a significant investment in Linux open source software in general, and the Linux kernel in particular. And one of the things that we have learned is if you want to work on storage architecture, you need to have, at a minimum, a very good understanding of how things work on the whole software side.

And I think back to earlier in my career, one of the eye-openers for me, I remember, this is way back on hard disk drives, we wanted to change the block size from 512 bytes to 4 kilobytes. And, of course, this was a no-brainer if you were a hard disk drive guy -- you make the ECC more efficient, you gain 10% or more areal density very, very easily. But, of course, it took a decade or more, probably more than a decade, to get that accomplished in the industry and some people would probably say it's still not done. So, this gives you some idea that if you want to innovate on the storage architecture side, you need to be able to innovate also on the whole software side.

16:11 DN: And so, more recently, you mentioned Zoned Storage, so we've applied that principle there. So, Zoned Storage is an umbrella architecture or like a common architecture for SSDs and HDDs. On the HDD side, it's for shingled magnetic recording or SMR HDDs. On the SSD side, it's for Zoned Namespace SSDs or ZNS SSDs. And those two types of media share a common set of constraints. When you write to these media, you want to write sequentially within a zone.
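The sequential-write constraint can be modeled with a write pointer per zone (a simplified sketch, not the NVMe ZNS or SMR command set). The class and method names are illustrative assumptions, and real devices track more than this, for example zone states and limits on open zones.

```python
class Zone:
    """Toy model of one zone: writes must land exactly at the write pointer."""

    def __init__(self, start_lba, size):
        self.start_lba = start_lba
        self.size = size
        self.write_pointer = start_lba

    def write(self, lba, num_blocks):
        # Reject anything that is not the next sequential block in the zone.
        if lba != self.write_pointer:
            raise ValueError(
                f"non-sequential write at {lba}, expected {self.write_pointer}"
            )
        if lba + num_blocks > self.start_lba + self.size:
            raise ValueError("write would run past the end of the zone")
        self.write_pointer += num_blocks

    def reset(self):
        # Rewriting a zone means resetting it and starting from the beginning.
        self.write_pointer = self.start_lba


zone = Zone(start_lba=0, size=8)
zone.write(0, 4)          # accepted: starts at the write pointer
zone.write(4, 2)          # accepted: continues sequentially
try:
    zone.write(0, 1)      # rejected: overwrite in place is not allowed
except ValueError as err:
    print("rejected:", err)
zone.reset()
print("write pointer after reset:", zone.write_pointer)
```

Because both SMR HDDs and ZNS SSDs can be described with this same model, host software written against it can target either kind of media.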

So, we have worked recently and, I believe, led the industry in terms of defining this Zoned Storage architecture and also creating the software ecosystem to support that. And since you mentioned computational storage earlier, computational storage will be even more like that -- that's even more of a problem where the problem statement itself really comes from the application side. It's easy to make a device that performs a specific hardware function, but you really have to be able to support that in the software. So, that's a tremendous area of focus for us.

17:13 DS: I'm going to switch topics on you. Another important area in storage is security. How does Western Digital think about security? What are we doing about it?

17:25 DN: One thing we can say about security is that any data storage company today or any data processing company, whether they like it or not, they are a security company. So, I think that's commonly understood in the industry. And as far as security itself is concerned, let me focus on two different domains. One domain is the area of, I'll call it security building blocks. So, we're talking about cryptographic algorithms, security protocols, the industry-accepted procedures and policies for certification, etcetera. So, in this area, these security building blocks, our general view is that this is an area where you don't get a lot of points for style, or innovation really. I mean, this is a place for innovation, but if you innovate the next cryptographic algorithm, it takes 10 years for the industry to vet it. So, when you're talking about making a security product, the strategy there is to use these tried-and-true cryptographic algorithms and policies and procedures and protocols, etcetera. And our belief in that space is that the industry will move to open and inspectable architectures.

18:36 DN: So, this will take a while, but this is one of the reasons why we were co-founders of the OpenTitan project, along with Google and lowRISC and Nuvoton and G+D and others. And the idea here is that the most secure solutions will be built around open and inspectable architectures, where people can go in and examine the implementation and find problems, etcetera, and it's through that open vetting process that you end up with the best solutions. So, in the space of these security building blocks, that's the way we think the industry is going to head towards these open architectures.

19:14 DN: There's another area of security, of course, which is taking these building blocks and making something for your customer, providing a feature or a service, and of course, in that space, there's a lot more room for innovation. And so, I think fundamentally the message for security is that the differentiator, if you like, in security space is really more about the company culture, how you think about security. Do you take it seriously? Is it a burden? How do you handle that within your company? So, it's a matter of having the correct culture within the company, and also supported by policies and procedures to really create a strong security environment. So, that's the way that we think about it.

19:55 DS: Thank you. Thank you, Richard. Appreciate it.


19:57 DS: I have the pleasure of talking with Ihab Hamadi. Ihab is a fellow of Western Digital. Ihab's past experience has been across the board in the semiconductor space, whether it is in networking, whether it is in compute or storage. So, let me take that question from there.

Ihab, we have been proudly talking about the fact that we continue to scale in storage, but it is not quite as fast a scaling in both compute and in networking. What are the implications for overall architectures? What is this leading to in the industry?

20:39 Ihab Hamadi: Yeah, it's interesting, Siva. It's been a fascinating journey with Moore's Law, with the semiconductor industry at large. As we look at the CPU within a server today, the single-thread performance has been mostly stagnant for the past number of years, while everything around it has either been getting larger as in the case of storage of all kinds; getting faster, as in the case of networks; getting even more responsive, as in the case of lower latency interconnects, lower latency flash and so on and so forth. And moreover, the amount of data that the world continues to generate is continuing to go up.

The problems that we're trying to solve continue to go up in complexity, and the types of insights that we're trying to extract also continue to go up in complexity. So, the question is, "How do we drive the performance needed to get us to the next level of requirements that are presenting themselves?"

21:44 IH: Within the data center specifically, there are broadly three different areas that will help us get to the level of performance, actually drive performance through the next decades. And the three areas are, one, distributed compute being embraced broadly in all of its forms. And this is an exciting area. It has been around for a long time for things like HPC, but now it's making its way through the data center, to the edge, the micro-edge, and all the way to the end-user devices. Computational storage is a very interesting technology that, really at the heart of it, is a form of distributed compute.

The second area is really the ability for us to match the workloads, the various workloads of today with a suitable set of infrastructure and this is where things like composable disaggregated infrastructure or composable infrastructure come in.

22:45 IH: The third area really is our ability to drive better performing applications and remove inefficiencies across the stack, really on the server. So, whether this is starting with the operating systems, kernels, drivers, libraries, all the way to the applications. If you take SSDs, for example, they traditionally present a well-abstracted and familiar block interface. And this block interface is great for end-user computing, laptops, even personal workstations and so on. But as you get into the data center, the scale really starts to matter a lot, and all these inefficiencies begin to really add up. So, the ability for us to remove duplications and reduce inefficiencies begins to yield improvements around things like durability, things like better endurance, and things like even higher performance that's consistent, whether you're doing reads and writes, for example, at the same time.

So, Zoned Storage, for example, is one technology we're very excited about that delivers all of these advancements. And we're excited to be announcing a new data center-class drive that is compatible with ZNS, which is now all part of NVMe -- it's all open standards.

24:15 DS: Very, very interesting. Now, you also mentioned going from the core cloud to the edge to the endpoint, to the end-user devices. Data architectures across this spectrum, what's new, what's happening from the core to the edge?

24:34 IH: It's very interesting the amount of change that the industry at-large is going through, Siva. Let's take video gaming as an example, and it's a very exciting time for gaming, especially as we look at things like cloud-hosted games.

Today, you can instantly gain access to the latest titles in cloud gaming from your mobile device. So, now, if we look at what's really enabling this experience, starting with the data center core, what you really see over there is a collection of console boards; in some cases they are all within what looks like a specialty server, it could be a 2U server housing as many as eight of these consoles. And right above that, you'll see a set of switches that are providing virtualization capabilities. Right on top of that, you'll see VMs or containers that are talking to these switches that are actually also connected to cloud storage, and this is where all the games are stored. We're seeing that being mostly flash due to performance reasons and scalability reasons, and so on. Along with that, you also see control plane functionality there with the ability to spin things up and down to match the workloads and to match the needs of the specific games that you're playing.

26:00 IH: So, what we're seeing there in terms of requirements for storage is more NVMe, again, due mostly to performance reasons. We're seeing a lot of advanced telemetry that enables continuous tuning for the storage. We're seeing a variety of form factors from U.2 to E1.S, E1.L and so on.

And as we make our way out to the edge, and this is where, as a gamer, this is the first point of connection that you'll get. The first thing that you see is a CDN or a content delivery network, and this is where you get a lot of the data cached, specifically the static data and the data that doesn't require rendering, for example, and so on. This is a great area for some of the higher density flash like QLC and, in the future, maybe 5 bits per cell and beyond.

26:53 IH: And what we're seeing over there as well is certain requirements around low power; a lot of these edge locations are very power-constrained. We're seeing requirements around performance with regards to consistency. You have a smaller overall number of storage units, so the ability to spread the load across many units becomes limited and, therefore, consistently being able to deliver performance becomes even more critical. We see a lot of dense form factors from EDSFF, E1.L and so on, and we also see a variety of environmental requirements. Temperatures ranging from minus 40° to plus 80° Celsius are really commonplace in the edge in that case.

27:43 DS: So, we have covered a vast ground. You started with Moore's Law, scaling, and limitations and compute disaggregation, and we have gone from the core to the edge architectures. We came back in and we talked about what's happening in the cloud itself with respect to the data architectures, and then we went ahead and talked about the networking part of it, with NVMe over Fabrics.

Pulling all this together is the centrality of data, the proliferation of data, and the fact that we are able to store -- and storage becomes the big part of the data center. Data center, data center footprint, data center capacity is driven by storage. The capital spending of large cloud data centers is dominated by how much they need to spend on storage, so this is what we call storage at the center of this data revolution.

28:46 DS: So, we heard from experts within Western Digital on the centrality of storage, the scaling of flash, how that is enabling this vast amount of data to be stored, the operating systems such as Zoned Storage, the security and the security operations that need to be done to maintain this data, and then all of the evolution in the architectures both at the core and at the edge to accommodate this volume of data.

This is an exciting period in the storage industry. Western Digital is doing its part to take our moral responsibility to ensure that we are providing you with the solutions that you need to access and create value out of data.
