Download this presentation: Hands-On Testing of Persistent Memory's Effects on Analytics Performance
00:03 Dennis Martin: Welcome everyone, this is the 2020 Flash Memory Virtual Summit. My name is Dennis Martin, I'm the senior analyst with Principled Technologies, and this is session D-9, hands-on testing of persistent memory's effects on analytics.
I'll just . . . Let me just mention a couple of things about myself and Principled Technologies. I've been in the industry a very long time and we are, at Principled Technologies, an independent test lab. We have servers, networking and storage, we've got many racks of gear and engineers who can test those things. We are also a full-service marketing agency. We have our own studio, our own camera equipment and all those sort of things, so we can do videos and that sort of thing. So, we can do a full service of reports and videos and technology stuff, but we like to actually get our hands on things and actually test them. So, that's what we do at Principled Technologies.
00:57 DM: I should mention also that I've spoken here at the Flash Memory Summit for several years now, usually talking about some aspect of performance, real-world performance using real applications and real servers and networking and storage and so forth. But because of this year being a virtual event and all the things going on this year, we're doing this as a virtual event. So, I'm doing this from my home office, so you get a little glimpse into my home office there in the upper-right-hand corner, but mostly what you want to pay attention to is what's on the screen. I do plan to attend the event, the virtual event, live while it's actually happening, although I'm recording this session in advance, but I'm looking forward to interacting with you all as the opportunity arises.
01:41 DM: All right, so let's get into the actual presentation here. Let me talk about our agenda. We're first going to talk about persistent memory. What is it? Where does it fit? How does it work? All of that basic stuff that we need to know about persistent memory, why it's different and so forth. Then we'll get into Intel Optane persistent memory and its operations. We'll talk a little bit about, specifically, the Intel Optane implementation and some of the things you need to know there, and then we'll get into our recently published lab test results where we actually ran it in a real-world environment and we compared it to a system that doesn't have Optane, so I'll get into those details a little bit later.
I should mention also that these slides do include embedded hyperlinks. I've put these there for your reference, so when you download the presentation, you can click on some of those things that are in a slightly different color and you'll get more information, usually technical resources about whatever that particular topic is.
02:35 DM: All right, so let's get going here. First of all, what is persistent memory? So, persistent memory is computer memory that is non-volatile. So, unlike the standard computer memories that we use these days, such as static RAM or dynamic RAM, persistent memory doesn't lose its data when you turn off the power. SRAM and DRAM do -- when you turn them off, everything goes away, but with persistent memory, the data still stays there. So, when you power it off, power it back on, all that data is still there just the way you left it. So, one question is "Have we seen this before?" And the answer is yes, some of the early computer memory technologies such as drum memory, magnetic cores and some others. Does anybody remember bubble memory? These were all non-volatile, that is, you put the data there and it stayed there even when you turned the machine off.
03:25 DM: So, the question then is, "Well, why did the computer RAM change? Why do we have volatile memory now, when we used to have non-volatile memory?" The basic reason is that the investments in the semiconductor industry, starting in about the 1970s, resulted in volatile memory technologies that were lower in cost and higher in capacity density than the existing non-volatile technologies.
And I'll mention this again later, but I'll just say it here now, there's lots of interesting things that come out of the semiconductor industry as we advance from one generation of technology to the next, and there's all kinds of cool technology things, but as it turns out, cost carries a heavier weight than all the rest of those other things. If you can't do it, if you can't make it cheaper, then it's probably not going to be a long-lasting technology. All other things being equal, you want it to be cheaper. Now, of course, in this case, you also get cheaper and higher in capacity, which is even better. So, we'll see that more in a little bit when I get into some other things, but cost is very important in the semiconductor industry.
04:33 DM: So, let me talk just a little bit about my own early history with this topic. When I first started in software development back in the '70s, yes, I've been doing this for a long time, I learned a couple of basics, and one of them was that there were instructions, or code, and then there was data. And, so, you had your program, you had your script, whatever you had, and then it operated on some data, but I also learned that I had to distinguish between two types of data. There was the data that you could throw away, it was temporary data, such as scratch space or intermediate calculations or whatever it was you were doing. You could put that in memory because it would go away or it could go away, whereas there was also this thing known as permanent data that you kept on something called storage, and this is data you wanted to keep. This might be the results of your calculation, or maybe data you're going to use in a graph, or maybe a database, you're getting names and addresses or whatever it is you're putting in your database, the stuff you want to keep. So, you have memory, which is temporary, and you have storage, which is permanent.
05:40 DM: Another thing I learned about was this thing called the memory and storage hierarchy, this is something that there's a whole . . . In fact, I'm going to show a pyramid a little bit later talking about some of this, but basically you have fast and expensive versus slow and cheap, so that when it comes . . . When you think about memory or storage, you have those options, and as it turns out, there are multiple layers here, but basically that's the way that works, and then the other concept in that, along those same lines, is this whole concept of online versus offline. So, if it's online, you can access it immediately, there's no manual intervention, nobody has to go mount something like a tape or a USB drive or whatever versus this thing known as offline, where the data is on something that a computer can use, but it's not online, accessible immediately -- somebody has to go do something manually to get it such as, as I mentioned, a tape or a USB drive or whatever. Of course, USB drives were invented relatively more recently than some of the other technologies, the offline technologies.
06:41 DM: So, when I was thinking about this, I said, "Well, why can't we just have data and let the computer just put it in the right place and do the right thing with it, either because of some policy or some knowledge about the data, something where you don't have to . . .
06:54 DM: Keep it . . . Keep track of, is this temporary or is this permanent? Now, of course, over the years, everybody sort of just knows how to do this from a developer standpoint, you know, "OK, there's memory and there's storage." But it's like, well, why couldn't we just have one type of this data? So, I was attending a SNIA meeting, a Storage Networking Industry Association meeting in Colorado Springs a few years ago, for something completely unrelated to this topic, and during one of the breaks I was talking about this concept with a guy whom some of you know named Jim Pappas. And I explained all of this, so this was in my own personal history, he goes, "Oh, you understand it, you get it, why we need persistent memory and what problems it solves." And so, that's exactly it. So, this is . . . We're moving ahead here and we're doing something really that's different or at least different recently that we haven't done before, or at least not as well developed as we had it before. And I'll get into some of that a little bit later here.
07:53 DM: So, one of the things you have to think about is, how do you address this data? It turns out there are two ways to address data: You can either use something known as byte addressing or you can just use something known as block addressing. So, software accesses memory in bytes, and so this would be called byte addressable. It turns out CPU cache lines often get memory in chunks of 64 bytes at a time, and these are often called load and store operations.
On the other hand, software that accesses storage gets it in blocks, and these are large groups of bytes, typically either 512 bytes or 4,096 bytes, known as 4K, and these are called read and write operations. NAND flash memory is block addressable, not byte addressable. Hard disk drives are block addressable, not byte addressable. Whereas regular memory, DRAM and SRAM, is byte addressable. So, it turns out that persistent memory can do both. Persistent memory has both block addressability and byte addressability. So, if it has both, which one should we use? And the answer is, well, as it turns out, we need to use both.
09:08 DM: So, keep in mind, so that's one of the differences, persistent memory can be block addressable or byte addressable, whereas volatile memory is only byte addressable and storage is only block addressable, so now we've got something that can do both.
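To make the load/store versus read/write distinction concrete, here's a small Python sketch. It uses an ordinary scratch file as a stand-in for the underlying media (the file name and sizes are made up for the illustration): block addressing moves whole blocks with explicit read/write calls, while byte addressing maps the region and touches individual bytes, the way CPU loads and stores do.

```python
import mmap
import os

BLOCK = 4096  # storage is accessed in fixed-size blocks (e.g., 512 B or 4 KiB)

# Create a scratch file to play the role of the underlying media.
with open("demo.bin", "wb") as f:
    f.write(b"\x00" * (BLOCK * 4))

# Block addressing: move whole blocks with explicit calls, like a disk or SSD.
fd = os.open("demo.bin", os.O_RDWR)
os.pwrite(fd, b"A" * BLOCK, 0)      # write block 0 in one operation
block0 = os.pread(fd, BLOCK, 0)     # read block 0 back in one operation

# Byte addressing: map the file and touch individual bytes, like load/store.
with mmap.mmap(fd, BLOCK * 4) as m:
    m[10] = ord("Z")                # a single-byte "store"
    one_byte = m[10]                # a single-byte "load"

os.close(fd)
os.remove("demo.bin")
print(block0[:4], chr(one_byte))
```

On real persistent memory, the byte-addressable path would be a direct-access (DAX) mapping, so those single-byte stores reach the media without a block I/O operation in between.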
So, here's that pyramid I wanted to talk about. This is called the online memory storage pyramid, and this image comes courtesy of SNIA, you can go out to their website and look for the . . . What is persistent memory? And you'll see the slide here, at least this image. So, as it turns out, there are different types of memory and storage that meet these different latency and capacity requirements that we have. And you can see in this pyramid here, we've got several layers, and at the top, in the orange, you'll see CPU and the key thing we're looking at here is latency. Latency is basically the round-trip time it takes to get there and back again. So, we're going to refer to these numbers in nanoseconds, but I'm also going to refer to the stuff on the right that talks about in human observable terms, and then you can see these other technologies listed there -- and I added a couple of things here, I just wanted to say.
10:16 DM: So, if you look at it from the top working down, you'll see that each layer is faster, smaller and more expensive than the layer below it. So, just as an example, the memory that's in the CPU, these are the level 1 cache, I mean, well, you've got the registers certainly, but you've got the level 1 cache and so forth. Level 2, level 3, these tend to be very short latency measured in single-digit nanoseconds most of the time, and so, that's very, very fast. And as we move down the chart here, you're going to see 10 to different powers, and we'll talk about those. And as I move down the chart, then we'll jump over and look at the human observable terms as well. And then as you work your way down, you'll see you have DRAM, you have NVDIMMs, you have performance SSDs, capacity SSDs, and HDDs or hard disk drives. And so, if you're looking at this from the bottom going up, then you would notice that each layer is slower, larger and cheaper than the layer above it.
11:16 DM: Now, getting back to that cost point that I mentioned before. If somebody were to invent a technology that didn't follow this cost guideline, whether you look at it from the top or come at it from the bottom, say the cost was much higher, or it was slower than the layer below it, then it doesn't fit in this chart, and it probably isn't going to sell. So, that's just the way it works out from a cost standpoint and from a technology standpoint. So, it's got to meet the latency numbers, it's got to meet the cost criteria, and of course, it also has to meet the capacity criteria.
So, let's kind of work our way down the chart here. So, below CPU, we have DRAM, and you can see 10 to the first latency, so that's a two-digit nanosecond response time, so that's pretty good. And we've got NVDIMM listed there, NVDIMM-N, and I'll explain NVDIMM, what those letters mean on the next slide. Work your way down here, we've got this NVDIMM-P, I'm going to come back and say a little bit more about that in a second.
12:22 DM: Then we have performance SSDs, these are your enterprise-class, much faster SSDs, and there's the latency there, 10^4 nanoseconds, so now we're considerably larger. And then capacity SSDs, 10^5, and then hard drives, 10^6. And because I also worked at a tape library manufacturer many years ago, I would also want to put tape on this list but, of course, offline storage would be tape and that would be below HDD in this hierarchy.
Now, let me explain what's going on on the right side in human observable terms. It's very hard as a human to really understand what . . . The difference between 1 nanosecond and a two-digit nanosecond time. I mean, that's just so fast that it's really incomprehensible from a human perspective. So, the folks at SNIA, when they put this chart together, they put these human observable terms on there, which is very handy. So, let's pretend instead of the latencies being 100 nanoseconds, let's just say we're talking about human observable terms. So, the time it takes to send an email these days is pretty quick, you can send it and receive it typically within a few seconds.
13:32 DM: Depending on where it's going and all that, but it's pretty quick. So, let's use that as that top layer, that orange layer there of CPU and DRAM. Let's just say in a human-equivalent terms, to get something would be only a few seconds to send an email and even somebody when they can open it and receive it and they could read it within just a few seconds. And then if they're sitting right there, they could type in and send back and you would get that response also fairly quickly.
Now, I'm going to skip the purple layer for just a moment and go to the performance SSD layer. Now, this would be . . . Suppose you're talking about sending an email across country. So, we're going at least in the United States, let's say we're going New York to LA and back again. So, you want to send it and then get a notification that it was delivered. That's the round-trip time that would happen in seconds if it were at this top level, whereas the performance SSDs, this would be like getting on a plane, going there, meeting somebody, and then jumping right back on the plane and coming right back.
14:31 DM: So, now we're talking hours, and that would be the equivalent of these performance SSDs, these are the higher speed, the enterprise class, typically NVMe, but that's where we are there. If you want to move down to the capacity SSD, now you're talking about days it would take you to drive across country. If you drove straight through from, again from New York to LA and back again . . . However, you can carry more with you in your car than you can typically on a plane, the carry-on luggage and all that sort of thing. And then the hard drive, again, looking at in human terms would be weeks. How long would it take you to walk across country and come back? So, that's just to give you an idea what the scale means as far as these powers of 10 and the nanoseconds and so forth.
But let's talk about this purple layer here. This is called persistent memory and it fills a gap between the 10^1 nanosecond range that we have with some of this DRAM and other memory technology versus the 10^4 nanoseconds of latency that we have with the highest or the fastest performance SSDs. So, we've got this layer in the middle, and there's a gap here. So, for many years there's been this gap here where you jump from memory to storage -- that's a big deal.
15:44 DM: It turns out persistent memory fits very nicely in that gap and so it fills that not only from a nanosecond and response time perspective, but it also fills it from a capacity perspective and from a cost perspective, at least that's the idea here. So, the cost fits, the capacity fits and the latency fits, so that's why this purple layer fits right in there. So, I've put that little arrow on there just to point to the purple layer. All right, so that's where we are with this whole storage pyramid. We've got persistent memory fitting nicely in between DRAM and performance SSDs.
Yeah, I mentioned that we've got a couple terms here I want to explain. So, we have . . . Notice on this chart, in the orange, we have something called NVDIMM-N, and then we also have something called in the purple we have NVDIMM-P. So, let's talk about what those are. It turns out there are three different types of NVDIMMs and all of these definitions I took from the SNIA dictionary and condense it a little bit, but you can go find it there and there's the first example of a link you can click on at the bottom of the screen there.
16:50 DM: So, let's talk about NVDIMM-N, NVDIMM-F and then NVDIMM-P. So NVDIMM-N is a DIMM that operates as persistent RAM, is byte or block addressable, and doesn't necessarily provide the same performance as DRAM. So, it's a memory device that is persistent. Now, how it becomes persistent and how that works, I'm going to talk about a little bit later. But just the concept of it, it's a persistent RAM, you can do byte or block addressable, and it may not be quite as fast as DRAM. It could be depending on what you're doing, but generally speaking, it's a little bit slower than DRAM, but close enough. So, you can put it in that same layer as you saw on the previous chart.
All right, NVDIMM-F is, it's a DIMM form factor; again we're talking about memory form factors here, but it's using a block protocol or a block access method.
17:47 DM: It appears as a separate address space from DRAM and may also provide a different performance level. So, this now is in memory, but it's definitely something different than the regular memory. So, the system would see it and say, "Oh, that's a different kind of memory." We're going to get into more of that in a little bit.
However, the one we really care about is this NVDIMM-P, and this is the one that operates in both the N mode and in the F mode. So, it can operate as a persistent DRAM with byte addressing, or byte and block as it turns out, or you can operate it as this NVDIMM-F, which is block access but still non-volatile. So, we've got the best of both here with NVDIMM-P, and that's the kind of device we're talking about for this presentation, the NVDIMM-P class of devices.
18:43 DM: All right, so let's talk about what the interface is for these things. So, these work on the memory bus. Today we have the parallel memory bus, DDR4, with DDR5 right around the corner, and so that's the primary way, at least today, that you do this persistent memory. You just put it right on the memory bus just like regular DRAM. And those two are the players right now. However, there are some other ways that you can do it, or will soon be able to do it, and there's a bunch of other standards out here in various stages of development that are trying to say, "Well, the memory bus works great, but there are some limitations there." We want to be able to get beyond what just the memory bus can do.
So, we have these things called near-memory serial connections, and we have this other topic, far memory, which is more like a fabric. So, let's talk about the near-memory serial connections. These are serial connections that get to memory that's nearby, that is, in the same system or at least in the same chassis, and there are several standards being worked on here. There's CCIX, that's that first one there, C-C-I-X, and they pronounce it "C6." You've got CXL and you've got OpenCAPI, and there are some other ones, but those are the three that you'll find the most about, and again, if you click on those links, it'll take you right to their web pages.
20:01 DM: And, so, that's a serial connection that's faster than the current memory bus of today, and it allows you to get to more memory, but in a slightly different method. Now, this far memory is more of a fabric, and this is for getting to memory that's in other systems, that's not in the same node as the CPU; it's somewhere else, in another device perhaps, or maybe in another part of the system entirely. And so, you'll see the Gen-Z Consortium is the main proponent of that sort of technology. As it turns out, and I'm sure you've already seen if you've been attending this event, you'll hear some things about Gen-Z, and the CXL and Gen-Z groups have figured out that they're actually complementary, and so they're doing some things mutually. There's a mutual collaboration announcement that was already made, and I think you'll hear more news about that sort of thing here at this Flash Memory Summit.
20:56 DM: Now, all of these newer ones here, as far as I know, all four of the near-memory and far-memory ones, use or require PCIe Gen 5, which is not yet available in production systems. In fact, we're only just now getting Gen 4 in some systems, and we've only had that a relatively short amount of time. Pretty much everything else is PCIe Gen 3, so we're talking about something in the future. It turns out these systems also will most likely be using DDR5.
Now, just to point this out: DDR5 does not require PCIe Gen 5, and vice versa. They don't require each other, but because of the timing of all these things, both technologies may begin to appear in production systems about the same time, probably at least a year away, probably 2022, maybe late 2021. And let me just talk about dates here for a minute. First of all, I don't have any inside knowledge about where these companies are with their development, the processor guys, the memory guys, and all that other stuff. I don't have any inside news; I'm just going by what's available publicly.
22:10 DM: However, this is what seems to be out there, what they've been talking about. Now, some companies might say late 2021, others might say more likely 2022. So, let's just look at that for a little bit. I've been around long enough to know that when a company puts a date out for some new technology, in this case, there are multiple new technologies, I'm thinking about a system that you might get with these technologies in them. The marketing guys want to push it as early . . . Make it as early as possible, whereas the engineering guys have to say, "Yeah, we've got to fix this, we got to get this done, we got to get this done, there's a lot done." And having worked both in marketing and in engineering, I understand that tension. So, companies will say, "Well, let's just say somebody said that they're going to have it in late 2021," so my question would be, "Does that mean it's just sampling, you're putting samples out?" Or does that mean you're in some sort of beta test stage by that time?
23:11 DM: I was a beta test manager at a company several years ago, so I know what that means. Does that mean it's sort of an early release program, a limited release program, or does this mean it's generally available -- anybody can just order stuff and you'll ship it right away or pretty quickly? I look at this from an end-user standpoint that says, "Oh, when can I order this and expect a reasonably quick delivery of it?" I want to order a production system that has this, either a production server, or a desktop or whatever is appropriate. So, that's the way I look at these things. So, that's why I'm thinking probably later, maybe 2022, we'll see all this. You know, maybe some of it will be in 2021, who knows, but that's my comment on dates, but that's where we are. So, we'll probably see these technologies that use PCIe 5 about the same time that DDR5 comes out.
In another presentation I've done, I actually compare DDR3 and DDR4, and PCIe Gen 2 and Gen 3, and . . . Most of the time even those didn't line up, but in this case, these might line up. It might not, but that's kind of where we are there.
24:17 DM: All right, so let's talk about what do you need to do to your system to make it work with persistent memory. So, first of all, the motherboards, the BIOS, the UEFI, need to understand that there's now two different types of memory. Up until very recently, when we started first seeing persistent memory, up until then, you just had DRAM, and that's all you had to think about. There was memory and that's all there was. You know, sometimes there might be a slight variation in the speed or the capacity, or those kind of things, there was a transition from DDR2 to DDR3, and then to DDR4, but it was still just memory, and you typically only had one type of memory, you wouldn't mix . . . You wouldn't expect to see DDR3 and DDR4 memory in the same system at the same time, that just doesn't work. The slots are different, and they do that on purpose because the voltages are different and all kinds of things are different. So, you only had one type of memory in these systems.
25:12 DM: Now, we need to understand that there are two different types of memory, and these memories serve two different purposes; again, we have the volatile and the non-volatile. So, the motherboards and the BIOS all need to understand that. That's the first thing that has to change. The second thing that has to change is that the operating systems and hypervisors need to handle persistent memory, both as memory and as storage. They have to know that there are two different kinds of memory in the system, and they can behave differently as far as the way the user wants to use them. So, if it's volatile, you want to just run it as regular old memory and that's fine. But this persistent memory can go both ways -- it can be memory or storage -- so you have to understand that as an operating system and a hypervisor. Certainly, the addition of persistent memory has non-uniform memory access, or NUMA, architecture implications because, clearly, it's not uniform; persistent memory is definitely different from volatile memory.
26:14 DM: Well, here's something that's interesting about persistent memory. Because it's a different kind of thing, it requires hardware and software support for what we call non-deterministic memory access times. And let me explain what I mean by that. DRAM is fairly consistent in its access time. It's got a pretty small window of variation in latency when accessing DRAM, whereas with some of these persistent memories, the window is wider. It's a little bit slower than DRAM, but it's also got a wider tolerance for how long it will take. And sometimes you get these outliers that take a little bit longer just because of the physics and the way it works, so you can't always expect that if you don't get your answer back within a certain period of time, then there must be an error. Well, no, it's just going to take a little longer, so you have to have wider tolerances for this latency, and that's why it's non-deterministic: because you can't always predict what that latency is going to be.
27:10 DM: I mean, it'll still be reasonable. It'll still be faster than traditional storage, but it's going to be a little different. The other thing to think about for these persistent memories is that, unlike DRAM, many of these persistent memory technologies have something we all know about: finite write endurance, or write cycles, and so forth. Technically, DRAM is not infinite, but its endurance is so high that it might as well be infinite. It's effectively infinite, whereas these other technologies have this finite write endurance, and I think most of us are familiar with NAND flash write endurance and write cycles and all that.
These persistent memories also have something similar. Now, most of them are actually higher than NAND flash, which is a good thing, but it's still finite, and so you at least have to be aware of write endurance. Some of them, maybe not as much as others, but you still need to be aware of that. So, that's what you have to do when you have persistent memory in a system, your system needs to know that, "Oh, this might wear out. So, what do I do when that happens?" If it's inside the system and normally you can't just hot swap memory DIMMs, I mean, that just doesn't . . . It doesn't work, which goes back to why we need some of these other kinds of ways of accessing memory. All right, so the other thing that we need is some new programming models to fully exploit this.
28:29 DM: As it turns out, the SNIA, Storage Networking Industry Association, has come up with this . . . There's a whole bunch of different technical work groups at SNIA, one of them is called the SNIA Persistent Memory Programming Technical Work Group or TWG. And they've developed something called the NVM programming model, which describes how systems should work with persistent memory. It doesn't specify the type of memory, it's just saying, "What are the operations you have to do and how should that work regardless of what the underlying physics of the technology are? Just how does it work?" It's not tied to a particular operating system or operating environment, it's just, what are the steps you should take to program this knowing that you're . . . Sometimes you might want it to act as memory and sometimes you might want it to act as storage. Now, there's a whole bunch of stuff out there. These guys have been working at this for a while, so that's out there and you can go find that, but there are some things that haven't been completely worked out yet, so what about the idea of remote persistent memory?
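As a rough sketch of what the programming model's byte-addressable path looks like in practice (map a persistent region, store into it, then explicitly flush so the stores become durable), here's a Python stand-in that uses an ordinary memory-mapped file. On real hardware you'd map a file on a DAX filesystem and flush with CPU cache-line instructions, for example via PMDK's libpmem, rather than mmap.flush(); the file name here is invented for the example.

```python
import mmap
import os

PATH = "pmem_demo.bin"   # stand-in for a file on a DAX-mounted pmem filesystem
SIZE = 4096

# 1. Create/size the backing region and map it into the address space.
with open(PATH, "wb") as f:
    f.truncate(SIZE)
fd = os.open(PATH, os.O_RDWR)
region = mmap.mmap(fd, SIZE)

# 2. Update the mapped region with ordinary byte-level stores.
payload = b"hello, persistent world"
region[0:len(payload)] = payload

# 3. Explicitly flush so the stores reach the persistence domain.
#    (On pmem this would be cache-line flushes plus a fence, e.g. pmem_persist().)
region.flush()

# 4. Unmap and reopen to verify the data survived, as it would a power cycle.
region.close()
os.close(fd)
with open(PATH, "rb") as f:
    recovered = f.read(len(payload))

os.remove(PATH)
print(recovered)
```

The key departure from ordinary volatile programming is step 3: with persistent memory, the application, not the storage stack, is responsible for deciding when its stores have become durable.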
29:25 DM: For example, if I have persistent memory in this server and I want to access the persistent memory in another server, and I'm considering it from a storage standpoint, how do I get there, and how do I get back, and what's the procedure for that? Because it's still a memory, but it's sort of also storage. So, there are some things that still need to be figured out there.
Another one here: what about uncorrectable errors in memory, where there's no equivalent to what RAID does for storage? So, what we mean by that is, we know about memory that has ECC, error correction code, and it can correct errors up to a certain point, but what if you go beyond that? What if you have multiple errors, how do you overcome that? Now, in storage systems like hard drives and SSDs and so forth, you can do things like RAID 5 or RAID 6, but you can also do RAID 10; I mean, there's all kinds of things you can do there. With memory, you typically haven't thought about doing that. You might mirror it, some systems allow you to mirror the memory, so you pay for, let's say, a terabyte, but you only have half a terabyte usable because one is simply a mirror of the other.
30:28 DM: Well, that's RAID 1 for storage. What about all those other RAID levels? And, so, are there things you can do there? And, so, people are working on those kind of things. So, that's what's available there.
All right, so now, let's talk a little bit about the NVDIMM -- I mentioned earlier that I'd give you a little bit more information about it. So, typically, these devices -- and they're available from multiple companies -- include DRAM and NAND flash on the same DIMM, but they're presented as one unit. And so what you get is the DRAM, but you've got NAND flash acting as a backup behind it. The system only sees the DRAM -- the operating system and the motherboard only see the DRAM -- but it's declared to be persistent because you've got NAND flash behind it, and you've got some method of flushing the DRAM into the NAND flash very quickly if the power goes out: some kind of super cap, or battery, or something. So, that's what NVDIMMs are.
31:35 DM: And then you can, in some cases, interleave them, and interleaving gives you something like a RAID 0 stripe -- you can stripe your data across some number of NVDIMMs rather than putting it all in one of them. Typically it's an even number, so two NVDIMMs or four NVDIMMs are the usual cases, and almost any even number will work, depending on how much your motherboard supports and all that sort of thing. So you can write data across there, which means you can get faster access, because now you're putting some of it on a different NVDIMM and you can run them in parallel, and that speeds things up a little bit. So, that's NVDIMMs and that's what you can do there.
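The striping idea above can be sketched as a simple address mapping. The 4 KiB interleave granularity and two-DIMM count here are illustrative assumptions, not values from any particular platform:

```python
# Sketch of how interleaving maps a linear address across NVDIMMs,
# RAID-0 style. Granularity and DIMM count are assumed for illustration.

INTERLEAVE_SIZE = 4096  # bytes per stripe unit (assumed)
NUM_DIMMS = 2

def locate(addr):
    """Return (dimm_index, offset_within_dimm) for a linear address."""
    stripe = addr // INTERLEAVE_SIZE
    dimm = stripe % NUM_DIMMS
    offset = (stripe // NUM_DIMMS) * INTERLEAVE_SIZE + addr % INTERLEAVE_SIZE
    return dimm, offset

# Consecutive 4 KiB chunks alternate between the two DIMMs,
# so a sequential access stream keeps both DIMMs busy in parallel.
print(locate(0))      # (0, 0)
print(locate(4096))   # (1, 0)
print(locate(8192))   # (0, 4096)
```

Because neighboring stripe units land on different DIMMs, a big sequential read or write is serviced by all the DIMMs at once -- that's where the speedup comes from.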
32:21 DM: All right, so let's talk about where we are today with persistent memory support. So, you can, in systems today, get systems that support NVDIMM-N, and some of them also support Intel Optane persistent memory. And so, again, there's a link there so you can get more information about those there.
So, let me talk about, first of all, what are the persistent memory-aware operating systems and hypervisors? So, for Linux, Red Hat Enterprise Linux, also known as RHEL or R-H-E-L, depending on how you like to say it, is persistent memory-aware -- the latest versions are -- and I'll also mention, because I've got these in parentheses here, anything that's binary compatible with Red Hat, which would be CentOS and Oracle Linux; those also have persistent memory awareness. SLES, or SUSE Linux Enterprise Server, also supports persistent memory, and Ubuntu supports it. And again, these are fairly recent versions -- you can't go too old -- but the support has been added. It's out there and all that sort of thing.
33:31 DM: VMware recently announced support for persistent memory as well, and Windows supports persistent memory. So, we've got a good basic set of environments that can support this now. Some of them added it just a few months ago, some a year or two ago, but we're there. We're in pretty good shape in terms of which OSes and hypervisors you can run that support persistent memory, are aware of it, and know what to do with it -- and that's important.
Now, the second thing you need to have is a persistent memory-aware file system and that's because this persistent memory can operate both as memory and as storage. You have to have some way of writing to it and file systems, at least in a block mode, would do that. So, in Linux, you've got three that I'm aware of. Btrfs, which is a little bit more experimental; but then ext4, which is pretty well known; and XFS, which is well known. And on the Windows side, NTFS is your file system that supports persistent memory. So, that's where we are today. So, looking at this, you can kind of say, "Yeah, we've got enough of an environment here, enough of an ecosystem here that we can actually start doing things with this." So, that's the story with support for persistent memory.
34:43 DM: Now, let's dive into a little bit more about the Intel Optane persistent memory, specifically how it works and how you use it. So, there are three modes of operating, and the third mode is just a combination of the first two, so let's just talk about the first two. The first one is called Memory Mode, and Memory Mode just says, "I've got DRAM acting as a cache in front of this persistent memory," which is typically a lot more memory than you have DRAM. So, you present this large amount of persistent memory to the operating system and to the hardware, but you also have some DRAM in front of it as a cache. So, you get the speed of the DRAM, but what looks like a much larger amount of it, and so, you have this very large memory. It just looks like very large memory. Typically, persistent memory comes in DIMMs that are at least four times the capacity of DRAM DIMMs, sometimes even more than that. So, you've got a very nice chunk of memory. So, that's one way of operating this: DRAM is a cache in front of the persistent memory.
35:46 DM: The other method, which is really the one of most interest to what we're up to today, is this thing called App Direct Mode. In App Direct Mode, the operating system and the application are aware that there are two types of memory in here: there's DRAM and there's Optane, Optane being persistent, and these are load-store memory. They act like memory, but at least the Optane can also act like storage, and each one can behave differently and appropriately. So, volatile DRAM is still volatile DRAM, but with persistent Optane, now you have this persistence. And again, that one can go either memory or storage, but the operating system knows and the application knows, and that's the key thing. The application now knows that there's something different here, and it can take advantage of it.
And then the third mode is a Mixed Mode where you just say, "Allocate some of it as Memory Mode and allocate some as App Direct Mode." So, really, you've got the two and as I said, App Direct Mode is the one that's of most interest to us, at least for this presentation, and this is where really the interesting stuff is happening.
36:53 DM: As I've got a link at the bottom there, Intel has a support matrix to tell you which OS handles Memory Mode and App Direct Mode and Mixed Mode. And so, I'm sure they update that from time to time and so you can click that link there and get the support matrix for Optane memory as it applies to Memory Mode and App Direct Mode.
All right, so let's see how App Direct Mode works. This picture . . . I'll just give credit now: this is a chart taken from the book Programming Persistent Memory by Steve Scargall, and it's something you can actually get for free -- if you just look for that book, there's a way to download it at no cost. These are folks working with the persistent memory programming model, and they're trying to help people figure out how this works. So, that's where this chart comes from. So, let me describe the text and let's look at the chart.
So, the idea here is, you're going to use a persistent memory-aware file system, but then you're going to use this thing known as memory mappings. So, you're going to map memory to it. So, you don't actually write it like storage, you just use the sort of operators the way you would, but it maps directly to memory.
38:08 DM: So, you get . . . Using these mappings, you go directly from your application right to the persistent memory and this is something known as direct access I/O, and they've coined the phrase DAX. And you'll see that any time you want to talk about something in App Direct Mode, you're going to use DAX. So, that's a keyword -- you'll see that in all the documentation. It's DAX.
Now, I'll mention a couple of things about DAX. On the Linux side, there's a couple of variations of DAX, but still, you need to know about DAX. That's kind of where you have to go and there's some other technical details that they get into there. On Windows, there's only one DAX mode, but you just use DAX. So, that's the key thing. You'll see that D-A-X acronym a lot. Now, as it turns out, with Optane, you can interleave them, just like you can NVDIMM-Ns and, again, you can set up two of them and stripe data across or four, whatever you like. I'm not sure how high you can go with that but, again, two or four are the common ones.
39:02 DM: So, again, looking at the chart on the right: the idea here is your application wants to put data in this persistent memory, and it goes straight through. There's no file system you're actually going through -- you've got a memory map and it goes straight through. So, that's the idea there with App Direct Mode.
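The memory-mapping idea behind DAX can be sketched with an ordinary memory-mapped file. This is a stand-in: a real DAX mapping needs a DAX-capable filesystem mounted on persistent memory, which this plain temp file is not, but the programming pattern -- map, then use loads and stores instead of read()/write() calls -- is the same:

```python
# Sketch of the memory-mapping pattern behind DAX: the application maps
# a file and then stores into it directly, with no read()/write() I/O
# path. An ordinary temp file stands in for a real PMem-backed DAX file.

import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "pmem.bin")
with open(path, "wb") as f:
    f.truncate(4096)  # size the "persistent" region

with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 4096) as m:
        m[0:5] = b"hello"   # store directly through the mapping
        m.flush()           # push the stores toward the media

# The data is there on reopen -- no explicit write() ever happened.
with open(path, "rb") as f:
    assert f.read(5) == b"hello"
```

On real persistent memory, that flush step corresponds to flushing CPU cache lines and fencing, so the stores are actually durable -- the page-cache round trip that a normal file write takes is gone entirely.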
So, the next question is, "Well, then what apps can do this?" So, let's talk about that. Persistent memory-aware application examples that use App Direct Mode. There are some that can do the Memory Mode, but that's not of as much interest here -- we want to talk about what apps can actually do this today. Remember, going back to my original early experience as a programmer or as a developer, you have to know that some memory is temporary and some memory is permanent, so which apps can actually do this? So, in Microsoft SQL Server 2019, they have this feature called the hybrid buffer pool.
39:58 DM: This is the part of SQL Server that's persistent memory-aware, and that says you can use that persistent memory either for buffer pools or database files, which is, of course, where you want to keep your data. So, that's one example. There's another thing called Aerospike Enterprise Edition 4.8, it also is persistent memory-aware and in that version . . . They had an earlier version that did a little bit of this, but in 4.8 they got all of what you see here. You can now store database indexes and data in the persistent memory and it knows that that's persistent memory. So, one of the side effects there is that if your data is in persistent memory and you reboot the system, it doesn't have to reload the database because it's in persistent memory, so it's just right there. So, that's another application that has it.
40:44 DM: SAP HANA, they've done some things now to move portions of their data to take advantage of persistent memory. They use something called column store main, and they can now take advantage of persistent memory for that. If you click that link, you'll get to a nice blog page where they talked about how they did that, and as it turns out, it works for Intel Optane and it works for some other stuff as well. But again, you get the idea here: they've had to make changes to the app so that it knows about persistent memory.
One other one that's in preview mode right now: Oracle Database 20c supports a mapped buffer cache and a persistent memory file store, or PMem -- sometimes you'll see the word PMem attached to these things. It's in preview mode, so it's not in production yet -- they're saying don't put this in production, since it's not released as a production product -- but it's coming soon, and that one also supports persistent memory.
41:36 DM: There are a few others that support it, but you can see it's going to take some time for everybody to get all this data or all their applications to really take advantage of persistent memory. This is a significant change from the way people are used to writing code now that you've got persistent memory. So, to really take advantage of it, you have to go in and modify it and it takes a while to get those changes done. So, this is where we are as of, let's call this end of October, early November 2020. There are others, these are just samples, these are not all of them, but I just wanted to give a few examples of what kind of apps are doing this.
All right, so now let's get into our tests. So, this is a description of our tests that we ran, we published this in August of this year. So, we took a Dell EMC PowerEdge R740xd server, and you can see we have some Intel Xeon Processors in there, and we basically just tested the performance of it with Optane memory and without it.
42:41 DM: So, just regular memory and then Optane, and I'll get to the specifics here in a second. We used something called HammerDB -- if you're a tester and you test databases, you'll know what HammerDB is. It's a test utility that simulates a couple of different types of workloads. For this one, we picked the TPC-H-like data warehouse workload. A data warehouse workload says, "I've got all this data in my database and I'm going to run these queries, and they're very complicated queries and they run in a certain order." And then, depending on how many you're doing, they'll come out in a different order. But it's a set of 22 queries that really say, "Give me lots of data all at once," and the database crunches on it a while and comes up with some analytics answers. So, that's what this workload does.
So, what we wanted to see is . . . There are two things about this test we wanted to see. No. 1 is how many simultaneous query streams could we process in roughly the same amount of time -- so we're going to compare an SSD test versus a persistent memory test.
43:43 DM: And then the second test is how long does it take to do just a single stream of 22 queries? So, this workload is a package of 22 queries, so you can say, "How long does it take to do one of them?" And, obviously, the fastest one is better. Faster is better or lower is better. And then the other one is how many streams can you run? If you run it one way, how many can you run, and then compare the other . . . Can you get more if you run it with Optane?
So, there's a link to our website. This will take you right to the report published in August 2020. I will say something about the way we do our reports. So, this link will take you to the main report, which describes the scenario -- it describes the test and it describes the results. And then at the end, we have a link to another report that we call "The Science," and this just says, here's everything we did to configure it, here's all the super-technical details that you want. Here's how we did the OS, here's how we did the . . . Here's the drives, here's all the technical details. So, if you want to get into the technical details and see exactly what we did to run it, then you look at the second report.
44:45 DM: But the main report is what we're going to talk about here, so let's get into that. So, we're going to do two things here, so let's talk about our server. So, again, we've got the Dell PowerEdge R740xd. We did enable two extra features there, one called performance mode and one called the LLC prefetcher. You can see the processors, the model, the core count, the clock speed, how much DRAM we had in the server and then how much persistent memory. And so, you can see we have a terabyte of persistent memory there, and there's the model number and so forth.
Now, when we ran these two tests, we only had this in there when we were running the Optane test. When we were running just the regular test, we didn't have it in there, that way there's no chance that it gets in the way or anything else happened. We're running on Windows Server 2019 Standard Version 1809. We were running SQL Server 2019 Enterprise, and as you recall from a slide or two ago, that was one of the applications that understands this. We're running this tool called HammerDB, and there's the scale factor for our tests. So, it's a TPC-H-like database, that's the data warehousing workload, at 3,000 scale factor. So, if you run these tests, you'll know it's a pretty good-sized test.
45:57 DM: All right, let's see, what else did we have? Oops. So, we set Optane to run in App Direct Mode. We said, "Definitely use the DAX feature." So, that's what we're after here. We're going to run it in App Direct Mode. We want the app to actually take advantage of it, we did set it up as interleaved, and we set it up this way, we have two App Direct persistent memory regions, and we just said, between the two of them, use all available space. And then from there, we created two volumes, one volume on each region, basically, so two volumes, 2 PMem volumes from those two regions.
46:32 DM: So, now we've got two large volumes running as persistent memory that the app could then point to. So, that's our setup here. Let's get a little bit more into the SQL Server part of it. So, we set our maximum server memory to 90%, or 345 gig, of the DRAM available, so we said, "SQL Server, take up 90% of the memory. You can do whatever you need to do." We did enable the hybrid buffer pool option, so that's the part that uses persistent memory. We've got a few other technical things here; if you're a SQL Server person, you might understand what this is. TempDB, that's the place where SQL Server puts its scratch space and so forth -- we set it to a gigabyte, and it'll automatically grow by a gigabyte as needed. We set the maximum degree of parallelism to eight. We set the worker threads to 3,000. Here's an interesting one you have to do: we set a security policy to allow SQL Server to lock pages of memory, because it's using memory now to store its database.
47:30 DM: We say, "Yeah, lock 'em in there, don't try to swap them out, just leave them in there because it's persistent." Or at least in the PMem test, it would be persistent. And here's where we put the database files, so this is the difference between the tests we ran. The first test, we said, "Put all the data on SSDs," a RAID 10 stripe across six SSDs -- you see the model number there and so forth for those SSDs -- whereas when we ran the Optane part, we said, "Put half of those database files in one of the persistent memory volumes, put half on the other." And then the logs -- because with databases, you always have databases and logs -- the logs we left as a RAID 1 setup on some NVMe SSDs. You can see the model numbers there. So, we didn't touch the logs; all we did was move the database files from the SSDs over to the persistent memory. So, that's our test.
48:20 DM: All right, so let's get into the result here, this is actually right out of the report, so it was really easy for me to grab this, so let's talk about the left side of this graph. This is Figure 1 in the report, so we said . . . The first thing we said was, "Can we get more simultaneous streams to run roughly the same amount of time if we use persistent memory as if not using persistent memory?" As it turns out, we could. So, we started out with eight because eight is a suggested number for this workload. This is how many it should start with, so that one ran fine, but we said, "How many more?" and we cranked it up to 12.
48:58 DM: So, we ran on the Optane memory with 12 streams, so 50% more, and it actually . . . Even that completed in a little less time than the eight streams did there. So, that just says you can run more stuff in parallel because the persistent memory is faster than the SSDs. So, that's our number there, so 50% more streams and about 3.2% faster time or less time to complete. So, whenever you're measuring time, of course, you want it to be less. I mean, better would be less, but on the number of streams, this is just raw throughput, you just get more done. You can run more of them at the same time. So, that's our first result.
49:42 DM: Our second result here, this is Figure 2 from the report. This just says, when we run the single stream, the 22-query stream, just one, does it take less time to run, or is it about the same, or is it more, if you're using Optane? As it turns out, it was less time -- about 26% less time to run the single package of 22 queries using persistent memory as opposed to using the SSDs. So, that's a nice improvement as well, so let me just go back one here. So, more streams -- 50% more streams, 12 versus eight -- in a little bit less time, about 3% less time, and then the single 22-query package ran in nine minutes less, or about 26% less. So, those are the two basic things. So, that just tells you that persistent memory really does make a difference. You get more things done, and you get them done in less time, so that's the idea there.
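A quick back-of-the-envelope check of those two results. The stream counts and percentages are from the report; the implied single-stream baseline runtime is my own arithmetic, not a figure the report states:

```python
# Sanity-check the reported numbers. Stream counts and percentages come
# from the report; the derived baseline runtime is inferred arithmetic.

streams_ssd, streams_pmem = 8, 12
more_streams_pct = (streams_pmem - streams_ssd) / streams_ssd * 100
assert more_streams_pct == 50.0  # "50% more streams"

# Single 22-query stream: ~9 minutes less, stated as ~26% less time,
# which implies an SSD baseline of roughly 9 / 0.26 minutes.
baseline_minutes = 9 / 0.26
print(round(baseline_minutes, 1))  # roughly 34.6
```

So the single-stream SSD run was somewhere in the mid-30-minute range, if the 9-minute and 26% figures are taken together.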
50:42 DM: All right, that's the end of my slides here, so if you have any questions, you can definitely look me up on LinkedIn. I'm not posting my email address here only because I don't want it to be scraped, and then I get lots of spam email because my old email address when I was doing my own company, that's what happened, so I'm not going to do that here. But yes, look me up on LinkedIn, if you click that link, that'll take you to my page, just look for Dennis Martin on LinkedIn.
You can also visit our company website, at Principled Technologies, and we also are on the social media sites that you see there -- Twitter, Facebook, LinkedIn and YouTube. So, we post all of our things there, all our public things are there.
So, with that, I want to thank you for your time and wish you a good rest of the conference. And, again, I'd love to answer any questions you have, either about this or persistent memory in general. Thanks very much. And we'll catch you later. Thanks a lot.