00:04 Jean: Okay. Hello everybody, and welcome to our annual Flash Memory Summit Top 10 Things panel. This is usually a fun session in which we all weigh in on what we think was really hot and really notable during the year. Well, flash memory has morphed rapidly from an exciting new technology just a few short years ago that had to be justified for special use cases, to becoming really a standard part of on-prem data centers and also cloud provider infrastructure. In this session, we'll examine the top 10 things that our panel has noticed about flash this year.
So, what do vendors and users alike need to know about flash? Is it the emergence of 3D technology, the rapid rise of high-speed NVMe, the promise of persistent memory, the role of flash in scalable systems, clouds and megawebsites, new methods for flash storage networking, such as NVMe over Fabrics, ways for software to take advantage of flash memory, or large hierarchical storage systems that cover everything from high-speed cache to long-term archiving?
01:13 Jean: So, our top industry experts will present their own candidates for the top 10 list, and here they are. You'll see them on screen, we have in the order in which they appear, Dave Eggleston of In-Cog Systems or Intuitive Cognition Consulting, is the full name. And we have Tom Coughlin on the left there, who is president of Coughlin & Associates. Eric Herzog in the center from IBM Storage. We have Jim Handy from Objective Analysis, and Rob Davis of Nvidia Networking. And they can actually tell a little bit more about what they actually do when they start. Everyone's going to have about two to two and a half minutes to go over their favorite topic in the beginning. And we're going to start with Eric Herzog, who is going to discuss NVMe and the role of software. Eric, take it away.
02:10 Eric Herzog: Right, well, thank you very much, Jean, and IBM really appreciates participating in so much of the Flash Memory Summit. So, my name is Eric Herzog, I'm the CMO and vice president of Global Storage Channels for IBM. I've been in the storage industry for almost 45 years. I've worked at several Fortune 500 storage companies: Maxtor, Seagate and EMC, as well as IBM, and I've also done seven storage startups, and thank God, five of those seven have been acquired.
So today we're going to talk about how things work in flash and some of the new technologies. So, what we saw this year was an incredible rise in the role of NVMe. NVMe provides a performance framework that increases the performance of all-flash arrays. There are two aspects of that. First is NVMe inside storage systems, so the storage array controller uses NVMe to talk to the midplane or backplane, depending on the design of the storage subsystem, and the flash modules are NVMe-enabled. We started shipping NVMe in our all-flash arrays back in the summer of 2018. One of the myths of NVMe is that it costs more money, and I'm here to say that that's bull hockey.
03:24 EH: When we introduced our first NVMe all-flash array, we reduced the price by 30% compared to our previous generation, which did not use NVMe. When we introduced our second generation of NVMe all-flash arrays, which was February this year, our high-end product, the FlashSystem 9200, which replaced our former high-end product, the FlashSystem 9100, was not only faster, 40% faster in IOPS, 40% faster in latency, we also brought it out 12% cheaper. So, the myth that NVMe costs more money, at least from a storage array perspective, is not true.
The other aspect of NVMe is NVMe in the fabric. So, we started shipping NVMe over Fibre Channel a year and a half ago, so about six months after our first launch of NVMe inside the storage array. We've done a couple of industry papers jointly with Brocade. One of them is just available and covers our new product, the FlashSystem 9200, which came out in February, showing dramatic performance increases in Oracle decision systems, 46%, and in a number of other performance metrics in the traditional transactional workloads of block storage.
04:39 EH: So, one, NVMe is not more expensive. It may depend on your vendor, but in point of fact it is not. Also, NVMe over Fabrics is now becoming more popular, which it was not before. By the way, NVMe over Fibre Channel, as many of you are probably more system-centric than software-centric, was just certified by VMware with the V7 launch, which was earlier this year. We were shipping it, again, 18 months ago, and we did have customers using it with VMware. It just wasn't "Certified by VMware," but it was working. But now they're certifying it, and since VMware is, of course, the largest virtualization platform by far in the enterprise, and most apps are about 75% to 80% virtualized, some of the more conservative customers will do it now. The second thing that's important we see in flash arrays.
05:36 Jean: Keep going, but you only have about 30 seconds.
05:39 EH: I will. The second thing is the importance of storage software. It's not about the hardware. I shouldn't say it's not all about the hardware, it's about the storage software that comes with it. Does it replicate? Does it snap? Does it encrypt? Do you use AI-based tiering so you can tier from one array to another, or from a storage class memory to flash? Or our flash to industry standard flash? And those are things that people need to see and things that are very important to enterprise buyers. It's not just the storage system and its parameters, but what about the software and how does the software help me?
06:13 Jean: Alright, well, for the sake of time Eric, let's just call that a wrap for this part, and then we'll be able to come back in the second round. Okay? So, Dave Eggleston, you wanted to talk about persistent memory and AI workloads from your perspective.
06:31 Dave Eggleston: Sure. So, I'll start where Eric left off, which is NVMe is certainly a good breakthrough for the industry. Something new at Flash Memory Summit this year, we did a track, a full track on storage for AI, and NVMe is one of the foundational pieces that tie together . . . You've got different compute elements now. It's no longer just the CPU, you've got CPUs, GPUs, this new thing called the DPU, which I'm sure Rob Davis will want to talk about more, and then also you've got FPGAs or dedicated ASICs, all handling some part of AI workloads.
So how do you tie that all together? So NVMe, whether it's NVMe-oF or NVMe using TCP, those are a couple of different ways that we see of tying that together. Now, why is that important? So, Jensen at Nvidia talked about it at GTC this year, that in order to manage AI, we have to rethink the whole compute architecture. And as Eric was talking about, that's both hardware and software. So, I'm going to talk just a little bit about this idea on the hardware side: how do you then reduce the problem of the CPU being a traffic cop for data moving between these different processing elements?
07:44 DE: And one of the key things is going to be, how do we move some of that work out to other places? Maybe farther out on the network or into the storage controller itself. So computational storage is an area that SNIA has focused on for quite a while, and that's getting some traction as we move certain tasks in there, whether they're encryption, or doing compression and decompression, or doing part of the database search inside that storage controller itself, unburdening the CPU. Again, the DPU is something . . .
A new term, which is kind of an expansion of the SmartNIC, is also moving intelligence, in this case, out to that network card and doing more of the work there. So I think that's . . . Jean I'll toss it back to you, but I think that's one of the key things as we look at storage for AI, is both software changes and hardware changes, rethinking that whole stack in order to be able to handle this, what I like to call the AI, the beast, we have to feed the beast with lots of data.
08:45 Jean: Dave, you have 30 more seconds to talk about persistent memory.
08:48 DE: I would love to, and I think both Jim and Tom will weigh in on this as well. And let me do that as a jumping-off point also for storage for AI, because one of the most interesting things I've seen recently from Intel was talking about how, when you're using storage and there are misaligned blocks or small IO, how do you handle that? And they showed something very new, which is putting in place something called DAOS, an object storage engine that they have, and then you take those small and misaligned IOs and send them to their persistent memory DIMMs, whereas the big IO, the large blocks, would go off to the SSDs. When they did that, they immediately jumped to the top of the HPC ranking for IO500, so that's an example of how a mix of using persistent memory along with storage can be really beneficial to HPC and AI workloads.
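The routing idea Dave describes, small or misaligned IO to persistent memory and large aligned IO to SSD, can be sketched in a few lines. This is only a hypothetical illustration and not the DAOS API: the function name `route_io`, the 4 KiB alignment and the 64 KiB size cutoff are all assumptions chosen for the example.

```python
# Hypothetical sketch of size/alignment-based IO routing; not DAOS code.
ALIGNMENT = 4096             # assumed block alignment in bytes
SMALL_IO_LIMIT = 64 * 1024   # assumed cutoff between "small" and "large" IO


def route_io(offset: int, size: int) -> str:
    """Return the tier that should absorb this IO: 'pmem' or 'ssd'."""
    # An IO is misaligned if its start or its length is off the block grid.
    misaligned = offset % ALIGNMENT != 0 or size % ALIGNMENT != 0
    small = size < SMALL_IO_LIMIT
    # Small or misaligned IO goes to persistent memory; the rest goes to SSD.
    return "pmem" if (small or misaligned) else "ssd"


if __name__ == "__main__":
    for offset, size in [(0, 512), (4096, 1 << 20), (100, 1 << 20)]:
        print(offset, size, route_io(offset, size))
```

A real system would pick the cutoff from measured device behavior; the point here is only that the decision can be made per request from its offset and size.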
09:46 Jean: That's terrific for a quick round and we'll come back to you, of course. I believe, next, we're going to Rob Davis. He's going to speak more about DPUs and GPUs, what they are, how they work, and why they're important. So, Dave, I mean, Rob. [chuckle]
10:02 Rob Davis: Thank you, Jean, and thank you also for inviting me to your panel. So DPUs and GPUDirect Storage are the two subjects I want to talk about. As Dave mentioned, DPUs were mentioned by our CEO a couple of times now, both at VMware's keynote with Pat Gelsinger and in the keynote for our own conference, the GPU Technology Conference.
10:32 Jean: Your conference is called GTC, right?
10:34 RD: Yes, exactly.
10:35 Jean: Can they still access that online as well or . . .
10:38 RD: Absolutely.
10:38 Jean: I think they can, yeah, okay, good.
10:40 RD: So of course, everyone knows what a CPU is, and for many years, CPUs were really the only programmable element in computers, and more recently, GPUs or graphics processing units have come to the forefront. Originally, they were used to deliver graphics, of course, real-time graphics, but their parallel processing capabilities made them ideal for accelerating computing tasks like artificial intelligence, deep learning, big data analytics. And so now with the CPUs and the GPUs powering these enormous hyperscale data centers, we have a powerful new category of processors called DPUs or data processing units. So, the CPU is for general purpose computing, and the GPU is for accelerated computing, and the DPU is for accelerating secure data movement around the data center between these different CPU and GPU elements.
11:38 Jean: Right, right.
11:39 RD: And I can go into more detail on that in the next part of the . . .
11:42 Jean: Yeah, I think that might make sense. For sure. Well, thanks very much. And next, we'll hear from Tom Coughlin about emerging memory technology and some research he's just completed.
11:55 Tom Coughlin: Well, thank you very much, Jean. And also, it's good to be on the panel with the rest of you. The research you're talking about is actually with another fellow on here, Jim Handy, a colleague. He and I did a recent report, "Emerging Memories Find Their Direction," and that direction is going to be filling a lot of these needs.
I think Dave Eggleston sort of implied some of these things, is that, in order to enable . . . And also Eric in some of the systems side he's talking about, is that the emerging memory technologies, whether it's MRAM, resistive RAM, phase-change memory, potentially even things like ferroelectric memory are going to be playing a bigger role. Right now there are resistive RAM devices, phase-change memory RAM devices, Intel Optane is one of those. I think Jim's going to talk more about that. And also, magnetic random-access memory, and we see these increasing. First of all, there're standalone chips that have been used for caching and buffering for many applications.
12:52 TC: In fact, Everspin, which is the largest company that's making standalone MRAM products, said they've shipped over 120 million chips. But in addition to these discrete devices, the big wave in these emerging memories is going to be in embedded products, and those are for industrial, for server, and for consumer IoT-type applications. And we see that overall, particularly between the 3D XPoint and the MRAM, we could be seeing by 2030 a $36 billion market for these types of memories, and a big increase in the capacity being shipped for these devices.
13:36 TC: Now, embedded means it's inside of the chip itself, and in particular some of the applications that are going to be driving that are going to be inference engines for AI applications where it's done at the endpoint, so you need weighting functions. There's a company called Ambiq, whose chips are made by TSMC, one of the largest foundries in the world. They recently announced their Apollo4 chip, which has four megabytes of MRAM in it, to help with AI-type applications.
All the major foundries have talked about making MRAM as well as resistive RAM products. TSMC has also talked about by this year and into next year, products coming out with resistive RAM for other types of embedded applications. So, there's getting to be a whole zoo, if you will, of different types of memories, we're expanding the whole memory market. And what's intriguing about this is that with NOR flash, possibly SRAM reaching their limits in terms of scaling, that we may be moving more into a . . . And even for DRAM applications, moving from volatile memories into non-volatile memory architectures, and these could have an enormous impact on the design of future devices both embedded, and for large systems as well.
15:01 Jean: Wow, that's great. I think we're on time, we might as well go to the next person and then have our general discussion, alright Tom?
15:09 TC: Sure.
15:09 Jean: It's okay? Alright. So, you know this next panelist, Jim Handy, president of Objective Analysis, and Jim is going to talk today about China, Asia-Pacific in flash, and also about 3D XPoint memory. So, take it away Jim.
15:25 Jim Handy: Oh, thanks a lot, Jean, and you've really put together a great panel here, so I look forward to the open discussion.
15:32 Jean: One more thing, can you turn up your volume a little bit?
15:36 JH: Turn it up?
15:37 Jean: Oh, okay. You're not speaking as loud as the others, go ahead.
15:41 JH: Oh, okay. Sorry about that. That's... I'm doing it from my phone, I thought it would be loud enough.
15:46 Jean: Oh, okay. Okay.
15:48 JH: So, I'll just try to boom at you.
15:49 Jean: Okay.
15:50 JH: Yeah. So, there are a lot of interesting things going on. I come at this from a little bit of a different perspective than some of the people, with my very strong chip orientation, and Tom did mention that he and I worked on a report on emerging memory technologies that's . . . I was doing chip stuff for that, but with 3D XPoint, then I'm also looking at it, from chips. I'd like to say that I look at SSDs and storage from the inside out. It's an interesting technology because Intel has put this together, XPoint, in order to give themselves a platform cell that outdoes its competition. Intel, I think, came to the idea a few years ago that if they were able to make a faster storage layer that sat on the DRAM bus, that they'd be able to provide a whole lot better performance in certain applications. And Eric mentioned things like Oracle and that kind of stuff, those, the database applications have done extremely well by harnessing SSDs, and now they're poised to do much better by harnessing what can be done with persistent memory, which I think Dave Eggleston talked about a little bit too.
17:14 JH: This is really a very compelling story, and Intel started about a year ago to roll it out with one of their Xeon processors; it's tightly coupled with that. And although the recent divestiture of Intel's NAND flash business showed us that their SSD business for Optane SSDs is not all that large, I've been expecting since the onset that the DIMM business would actually become very large, rising to over $10 billion a year within a few years of its introduction. We're still on track to see that. It's going a little bit more slowly, as anything does, you know, when it's a new bet.
One of the things that Objective Analysis, my company does is issue reports, and we've got a report on 3D XPoint memory, Intel's Optane, and we'll be updating that soon, but it has a very well-thought through explanation as to where the market's going and how it's going to get there.
18:18 Jean: And some sizing, right, market sizing? Am I right?
18:21 JH: Yeah.
18:21 Jean: Yeah.
18:21 JH: Yeah, market sizing. Something else that we do that's kind of a little bit offbeat for a semiconductor market research firm is we actually estimate what profits are. And so, we've been looking at 3D XPoint, and Intel has plowed a ton of money into this, which is something that's going to . . . Sorry about that. Going to help it pull ahead of other memory technologies, emerging memory technologies, and that's their . . . The fact is that they want this badly enough that they've invested to the tune of a couple of billion dollars a year for the past three years.
18:56 Jean: Right, I don't mean to cut you short there, but we wanted to back up just a little bit and say how flash, which is the subject of our entire conference, really, as you and I talked about, virtually all of it is being made in Asia. And this plant you're talking about, it's in China, and we have other plants in Japan that supply . . . I know SanDisk or now Western Digital, and also the new company that . . . It's Kioxia or something.
19:26 JH: Yeah, oh, Kioxia, yeah.
19:27 Jean: Right, right, so . . .
19:28 JH: And I'm probably not pronouncing that right, yeah.
19:31 Jean: My question to you yesterday was, where in the U.S. do they make flash? And I think your response was in one place or something, yeah.
19:38 JH: Yeah, yeah, Micron's got a plant that is used to make chips for the U.S. government in Manassas, Virginia.
19:44 Jean: And that's pretty much . . .
19:45 JH: Other than that, the NAND flash is pretty much made outside of the U.S.
19:48 Jean: Right, that's just . . .
19:49 JH: And this might be a good time for me to talk about the fact that the Chinese government wants to get into the NAND flash business.
19:56 Jean: Oh, it is?
19:56 JH: And they've got this company, YMTC who's been a keynoter for the Flash Memory Summit, over the past years, and there're a couple of other companies, CXMT is a DRAM firm that is having some success based on technology from Qimonda, and then there's another, JHICC, which is struggling to get to where it can actually ship a product, but all of these are efforts by Chinese companies, the JHICC and YMTC are very government-oriented, CXMT is kind of going on its own path. But anyway, it's an effort to get into the memory business, which then will hopefully, for the Chinese government, get the Chinese electronics industry a little bit more self-sufficient in that area.
20:47 Jean: Alright, well, amazingly and perhaps frighteningly, we are pretty much at the lightning round because we have five people. So, let's do a lightning round and let's go in the same order in which we began. So, Eric, did you have some other things that you want to point out as we do our lightning round, just with a quick minute to a minute and a half?
21:09 EH: Well, a lot of discussion on storage class memory. The key thing we're seeing so far is two deployments at the system level. One is as a cache to the array controller, and the other is as a standard piece of memory, just like flash, hard disk or tape. At IBM, we use it as a standard piece of memory. Our performance at our array level is the best in latency and IOPS in the industry, and that's documented not just by us but by others, so we don't use it and don't need it as a cache. That said, as fast as it is, it is very small in capacity compared to flash, and it's also exceedingly expensive. So it's just the way flash started: the people who bought flash, we'd all call them the speed demons. They needed the fastest, they didn't care what it cost, and in fact, before flash, there were some companies that would package up DRAM and sell it as an external system.
22:09 EH: So, there were a couple of them, one in the Bay Area, Texas Memory, which eventually went into flash, was one of those players, and there were a few others as well. So we see storage class memory in that niche: the speed demons want it, and other people will want it over time. Storage class memory will come down in price or will get bigger, which is exactly what happened with flash, and now flash is ubiquitous. For the analysts that cover the numbers at the system level, the number one array type in the world is the all-flash array, and the cost is basically . . . It's basically killed the 15,000 RPM array business and the bulk of the hybrid arrays, because you don't need them if the cost comes down. So, storage class memory will get there, but right now it's really for the speed demon crew. We do sell some of it, but compared to flash, it's minuscule at this time.
22:56 Jean: Okay, so let's go next to Dave, I think.
23:00 DE: Thanks. Let me build on that. So, Eric was talking about the use of Optane or 3D XPoint technology in SSDs. Tom, Jim and I sat on a very interesting panel many years ago, right? When 3D XPoint was announced, and I do remember saying at that time that the real purpose of it was to use it as main memory. We do see with the launch of the Optane DIMMs that now it is being used as main memory. Several speakers at FMS are going to talk about the acceleration that they get. And it's really two different areas where we see this new use of persistent memory. One is in-memory database and in-memory compute, really acceleration of SAP HANA. That's a place where we see it adding capacity at lower cost than DRAM.
23:48 DE: A second place we see it being used, which is quite recent, is increasing the number of VMs per server, so again, that's adding more memory than you can get out of DRAM, at a lower cost than DRAM. The third one, which I've come across very recently, was a calculation done in an article on The Next Platform. They were looking at an existing HPC system, which might take about 50 watts in the memory system. If you extend that, just doubling the number of cores and how much memory you'd need, and you use DRAM, that'll be 700 watts just for the memory subsystem. That's too much. So, another advantage I think we're going to see in the future for these new persistent memories is being used as main memory, but also to bring down the power consumption, because persistent memory consumes much less power than DRAM. So those are the three applications where I think persistent memory as main memory is driving ahead.
24:45 Jean: Well, thanks very much. Sorry, it's moving so quickly here. So, I think Rob, you're next, but we do have to keep it kind of brief.
24:54 RD: Okay. Well, I want to talk about applications where memory isn't enough, and that's with GPUs, because with artificial intelligence, often the amount of data required to come up with your algorithms and to start your artificial intelligence system is more than can even be put into the storage embedded in a system. So that's why we've come up with a new approach to moving data in and out of GPUs called GPUDirect Storage. And what it does is bypass the whole CPU, the CPU that the GPUs are hosted by, bypass its memory completely, and move data directly from the network using RDMA technology into the GPU's memory, and that is an order of magnitude increase in the performance. This product is in beta now, and there are many different partners that we're working with, pretty much all of the major players and a lot of startups. It requires RDMA, so for block interface it's NVMe over Fabrics, and for files it's NFS over RDMA.
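The difference between the two data paths Rob describes can be made concrete with a tiny model. This is a hypothetical sketch, not an Nvidia API: the function names and hop labels are assumptions for illustration, and the model only counts host-memory staging hops; it does not measure performance.

```python
# Hypothetical model of the two data paths; not an Nvidia API.
from typing import List

HOST_MEMORY = "CPU system memory"


def data_path(gpudirect: bool) -> List[str]:
    """List the stops a block makes on its way from storage to GPU memory."""
    if gpudirect:
        # With RDMA, the NIC places data straight into GPU memory.
        return ["remote storage", "NIC", "GPU memory"]
    # Conventional path: data is staged in the host CPU's memory first.
    return ["remote storage", "NIC", HOST_MEMORY, "GPU memory"]


def staging_hops(path: List[str]) -> int:
    """Count how many times the data lands in host memory along the path."""
    return sum(1 for hop in path if hop == HOST_MEMORY)


if __name__ == "__main__":
    for mode in (False, True):
        path = data_path(mode)
        print(" -> ".join(path), "| host staging hops:", staging_hops(path))
```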
26:10 Jean: Okay. Well, we have to kind of move on to wrap up, but we will take follow-up questions from chat and in-person, so thank you very much. Tom, if you have some quick remarks, you can make them here.
26:24 TC: Yes, I would like to make some remarks, building on what other people have said, is that the . . . First of all, I believe it was Eric that was saying that as you get . . . Make more of this stuff, it gets cheaper. That's a fundamental tenet of semiconductor fabs, as my friend Jim would point out as well. And one interesting thing on this, on the enterprise side, I think in the last few weeks I've probably been briefed on 12 different products that are using Optane. What's interesting is, it's the SSD NVMe version, as well as the DIMMs, that they're talking about there, so it really is taking hold.
27:00 TC: And there's something that Dave pointed out, which is the power savings. It's significant compared with DRAM. Also, for instance, if you're in a battery-powered device, an IoT device, or a wearable device of some kind, and you have an SRAM in there, for example, if you turn the power off, the data goes away and you have to repopulate it again. That's not the case with the MRAM, for example. And these are definitely the things, these little niches, new special applications, that as they build up, as they increase in volume, bring the cost down in manufacturing. It lets people bring the manufacturing of MRAM from the back end perhaps closer into the actual manufacturing of the devices. So, I think it's going to be a big game-changer.
27:46 Jean: Okay. And very quickly, going on to . . . Thank you very much. Going on to Jim. Jim we'll have to just wrap it up very quickly. Go ahead.
27:55 JH: Yeah. And I'm going to agree with Tom and with Dave, and bring in a little bit of what Eric said, which is that, in the past, we've looked at computing systems as having really two elements, and that's the cost and the performance, and now everybody's talking about the energy. With these new technologies you're not only able to get better cost performance because of the fact that you are bringing the storage closer to the processor or communicating, as Mellanox says, go with the . . . at a higher speed, but you also are able to get the energy down because of the fact that you're just simply using fewer computing resources to accomplish the same task. So, all three of those together, the performance, the cost, and the power, are really a big deal.
28:49 Jean: Right. I hate to call an end to this, but it's supposed to be a short session. Let me just say thank you to everyone, and also I'm offering, and this is probably a big mistake, I'm offering for everyone to send me any follow-up emails after our session when we have live Q&A, they can send any additional questions to my own email, oh no, [email protected], and I'll distribute those to the people that you want to hear more from, okay? And thank you very much to everyone, really appreciate it.
29:26 EH: Thank you, Jean.
29:27 Jean: Okay. Bye-bye, guys. Thanks so much.
29:29 DE: Thank you.
29:30 Jean: Okay.