Technical and Market Directions for Persistent Memory
Learn how persistent memory is evolving, how it is maximizing performance in next generation applications, and explore market growth projections.
00:10 Ginger Gilsdorf: Hello and welcome. My name is Ginger Gilsdorf, and I'm a software engineer at a hardware company. I've actually been with Intel since about 2015 right when Intel and Micron announced their collaboration around persistent memory, so I've seen the formative years of this technology, including working with a few database and analytics companies helping them adopt persistent memory in their data architecture. So, I wanted to share a few of those key learnings with you, and since time is really short, we'll go ahead and get started right away.
00:44 GG: I apologize for throwing such a busy diagram up at the beginning, but I did want to start with a good abstract view of the hardware resources that I'll be discussing. So in general, with your data architecture, you're trying to get data to the CPU so that it can actually be useful, and we know that in general, as we move away from the CPU to other resources like DRAM and different types of storage, those resources are cheaper and larger, but also slower. That's one of the trade-offs that we do. Let's add persistence to the picture. We know that the data to the right of the persistent boundary, anything in DRAM or the CPU is volatile.
01:30 GG: If you want that data to survive a power failure or a restart or just to persist across time, you do need to make sure that it gets to storage, even as you're operating on it in the CPU, and that has a pretty high cost in terms of time and performance.
Now, let's add persistent memory to the picture. It gets a little bit more complicated. Persistent memory is close to the CPU, much closer than storage; there's still a gap between persistent memory and storage. It can actually be either volatile or persistent. If you're using persistent memory in the persistent fashion, then what you've done is actually move that persistence boundary that much closer to the CPU, and there are a lot of really great benefits that you can get when you do that.
02:25 GG: As I was designing this diagram, it made me think of the fact that all of the resources on the system are stationary, they're not going anywhere, but you as an application designer make choices as to where you want that data to move all around the system which made me think about how animals in nature migrate. The natural resources are out there, they're stationary for the most part, and animals move in order to get to the resources that they want.
03:00 GG: So what I did is I took three different animals, I want to show you a few things about how they migrate, make an analogy to a data structure or an application that would fit with that pattern, and then give you a couple of examples of each pattern. So, I'll start with the monarch butterfly. This is a pretty famous migrator just because of the fact that they can migrate even up to 3,000 miles at a time. Sometimes they've been known to even cross the Atlantic Ocean, and they do this annually. The motivation when winter is coming, of course, is to move to a warmer climate.
03:44 GG: So, this makes me think of an application that I would call a performance hawk. This is something that does need data persistence, but storage devices are just simply too slow. So, the motivation to migrate that type of data structure application is that you want the data to be warmer in the sense that it's closer to the CPU. So, in this case, what you would be doing to migrate is to move from a storage device to persistent memory. And there's a couple of ways to do this, but I'm going to talk about the one where you treat persistent memory as if it is a storage device. You mount it on your system, you put a file system on it, and you just use it as if it was a regular SSD.
04:35 GG: The reason you would do this is that, yes, the persistent memory devices are faster than SSDs for sure, and also there's no real software modification you need to do in order to use persistent memory this way. You probably would want to do some tuning just to make sure that everything performs as best as it can, but you don't actually need to modify the way your software is operating. So, I'm going to give you a few case studies, and most of these are from real applications out there.
05:07 GG: The first one that has used this type of data migration is a time series database. And before persistent memory, they were using an SSD when their memory spilled over, so they were doing some sort of queries and calculations and any time they ran out of memory, they were paging to SSD. When they switched to using persistent memory instead, this really, really reduced the query latency which then, of course, improves performance because you can get an answer that much quicker, you can do a lot more with your application.
05:48 GG: Another case study is a data orchestration service that has a tiered cache. So, they have the option to store cache data in DRAM and SSDs. And when they added the option to cache in persistent memory, this gave their customers more choice, which is great, and this tier is faster than SSD. So again, it comes down to better performance.
06:18 GG: Alright, the next animal that I wanted to share is the shark. Like the monarch butterfly, some of them will travel thousands of miles every year, but the others that I thought were more interesting are ones that will do sort of a daily vertical migration between deeper and shallower areas of water, and they do this partly to find food, but also it helps them stick with their familiar environment. They're migrating up and down, but they're not moving their location in the ocean. And the tie-in to this animal is what I'm . . . The application that I'm going to call the memory hawk.
06:58 GG: Currently, this is an application or data structure that resides in memory because you need that fast access, but you're probably finding out that memory sizes are too small and/or too expensive and hoping that there would be a better solution. So, the motivation for this data structure to move would be to get better performance and/or reduce cost, but still retain that familiar environment. You're not making a huge leap, you're still able to program the same way you've been doing, but just getting a larger capacity.
07:38 GG: So, like I was mentioning, this is the data structure that you would move from memory to the volatile pool of persistent memory. As I've shown in the diagram on the right, the application has a DRAM cache, but it's getting data from the volatile pool of persistent memory. This allows you to expand your total system memory resources, have a larger working area of data. You don't have to do any software modifications and any of your frequently accessed data is still going to be in DRAM, it's just a cache, so that when you run out of DRAM and you need to get data, you go to persistent memory, which is faster than going to SSD.
08:30 GG: So, one of the case studies that we've seen with this type of data migration is an in-memory database that can scale out to multiple nodes. With persistent memory, this database can actually support the same data set using fewer nodes because each system has a larger pool of memory, then you need fewer nodes to store the data, so of course that's going to be a cost savings for any customers using that database.
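The node-count argument above is simple arithmetic, and a minimal sketch may make it concrete. The capacities below are hypothetical numbers chosen only for illustration; they are not figures from the case study, and `nodes_needed` is an illustrative helper, not part of any database's API.

```python
import math


def nodes_needed(dataset_gib: int, memory_per_node_gib: int) -> int:
    """Smallest number of nodes whose combined memory holds the data set."""
    return math.ceil(dataset_gib / memory_per_node_gib)


# Hypothetical example: a 12 TiB in-memory data set.
DATASET_GIB = 12 * 1024

# Assumption: 768 GiB of usable memory per node with DRAM only.
dram_only = nodes_needed(DATASET_GIB, 768)

# Assumption: 3072 GiB of usable memory per node once persistent memory is added.
with_pmem = nodes_needed(DATASET_GIB, 3072)

print(dram_only, with_pmem)
```

Under these assumed capacities the same data set fits on a quarter of the nodes, which is where the cost savings would come from.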
09:02 GG: Another case study is a search engine application that stores large tables of pre-computed data on the documents in their library. These are tables that you don't want to calculate on the fly, so you need to store them. If you use persistent memory in this type of an application, you've got a larger pool of memory to store from, and since you can actually get that table in memory instead of maybe on disk, you're going to get a faster response to any queries.
09:37 GG: So, these are two examples of where you might benefit from having a larger pool of volatile memory. Finally, one of the more interesting ones is the wildebeest. Now they travel across the African savanna yearly, and as I was reading about them, it says that millions of them just kind of get up and move all at once, and they somehow know when it's time to migrate. The motivation for them is, like I said, greener pastures, better resources, better food. So, this brings me to the data structure or application I'm calling the hybrid. This is something that currently resides in memory for fast access, but persistence does add some value, and I'll explain a little bit more about that in the next little bit.
10:29 GG: So, the motivation here for movement is to get better memory, in that it's persistent, and better storage, in that it's faster. So, it's a hybrid of memory and storage, but the tie-in here is that the wildebeest and the hybrid are both looking for something that's better than what they've got right now. So, this is the case where you actually do need to do software modifications, moving a data structure from memory to persistent memory. The fastest way to do this is to memory map your persistent memory into the application's address space, and this allows you to get direct load/store access, bypassing the kernel, at cache line granularity.
11:20 GG: So this is really, really fast, especially compared to things like SSD, but you do need to worry about making sure that when you do a store to your data that that store gets persisted to the device, not left in the cache. There's plenty of resources that you can look up online about that, but that is one thing that you do need to pay attention to as you're working with this type of data migration. So, I mentioned that one of the keys to the hybrid is that you want fast access, but you also need persistence or persistence has some sort of value.
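The map-then-persist pattern described above can be sketched in a few lines. This is an illustration under stated assumptions, not the API of any persistent memory library: an ordinary temporary file stands in for a file on a DAX-mounted persistent memory file system, and Python's `mmap.flush()` (which asks the operating system to write the mapped range back) stands in for the cache-flush step that a dedicated library would perform. The names `PMEM_SIZE`, `store_and_persist`, and `demo` are hypothetical.

```python
import mmap
import os
import tempfile

PMEM_SIZE = 4096  # hypothetical region size: one page


def store_and_persist(region: mmap.mmap, offset: int, payload: bytes) -> None:
    """Store bytes through the mapping, then flush so the store is written back."""
    region[offset:offset + len(payload)] = payload  # plain store, no write() call
    region.flush()  # without a flush, the data may not have reached the device


def demo() -> bytes:
    # Assumption: a temp file stands in for a file on a pmem file system.
    fd, path = tempfile.mkstemp()
    try:
        os.ftruncate(fd, PMEM_SIZE)
        with mmap.mmap(fd, PMEM_SIZE) as region:
            store_and_persist(region, 0, b"hello pmem")
    finally:
        os.close(fd)
    try:
        # Reopen the file, as an application would after a restart.
        with open(path, "rb") as f:
            return f.read(10)
    finally:
        os.remove(path)


print(demo())
```

The design point is the pairing: every store that must survive a restart is followed by an explicit flush, because the store on its own only guarantees that the data reached the CPU's view of memory.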
12:04 GG: So, there's an in-memory database that has . . . Part of their architecture is a large amount of table data stored in memory. When they switched that from memory to persistent memory, this allowed them to not only have larger tables, but also when they restart the system, the table is still there, and they don't have to go through and rebuild that data like they did when they had their data in memory. So, it's a savings in terms of time, and it also allows you to have larger tables in memory.
12:44 GG: Another one, this is somewhat similar, but a key-value database. The keys are stored in memory for, again, for that speed, but when you put those keys in persistent memory, again, you get the same savings on system restart, and again, you're able to store more keys per node, which usually allows you to store more data per node. So, this is a scenario where you want something that's really fast, but you have a reason that persistence adds value to the picture. It's a little bit more work to modify your software for this, but you're truly getting the best benefit of the technology with this.
13:28 GG: So to end up, now that you've seen a couple of different examples of data migration, you should be able to identify some of your own data migration motivations, keeping in mind all of those different hardware resources that are available to you and the reasons that you might want to migrate from one area to another. SNIA, the Storage Networking Industry Association, has a great website with lots of information to get you started, and then pmem.io has another great set of information for persistent memory programming. My information is there if you would like to contact me. I'm happy to answer questions.
14:12 GG: And I'm also... I'm participating in the panel that's coming up at 10:45 AM, and it is live, a live Q&A panel. So, if you have any questions, I'm happy to see you there. Thank you. And happy data migration.
14:30 Tom Coughlin: Hello, I'm Tom Coughlin, and I'm glad to talk to you at the 2020 Flash Memory Summit. So, I've been involved in digital storage for over 40 years as an engineer, as an engineering manager, as an executive, and I've been running my consulting firm Coughlin Associates for over 20 years now. I've been engaged in engineering and market analysis, due diligence for investors, a manner of different things that I do, technical work. I've also been involved in writing reports on digital storage technologies and their application, including one upon which this talk is based, and I've been a founder of the Storage Visions and Creative Storage Conference, and general chairman of the Flash Memory Summit for 10 years and also been involved in a number of other events as well.
15:23 TC: And I'm glad to be here today to talk to you about market directions for persistent memory. And let's talk first about why emerging persistent memories are necessary today. The biggest reason probably is that flash can't scale with the process advances. That's why NAND flash went 3D at 15 nanometers. But 3D is not cost effective in a CMOS logic process, so it's difficult to use it in embedded applications. NOR scaling stopped with FinFET. 20 nanometer and smaller processes need something new.
15:58 TC: SRAM scaling may also be reaching its limit, and we'll talk about that in the next slide. DRAM, a common memory, consumes a lot of power because of its refreshes, and that can be an issue if you're in a low energy or battery-powered application. So, standalone persistent memory applications are growing for a lot of different reasons, both in the data center and for consumer applications, and low power, high-density memory is needed, especially in these embedded applications.
16:29 TC: Semiconductor process shrinks have worked for decades to reduce costs. This is illustrated in the left chart from my colleague, Jim Handy. The process runs along the bottom, shrinking as we move to the right. Relative cost on the vertical axis has historically come down with every process shrink, but flash memory has a problem. Flash is unable to scale past a certain point. For NAND flash, this was 15 nanometers, so 3D NAND was adopted. NOR flash, the only non-volatile memory currently used to store the code in microcontrollers, ASICs and other SoCs, is unable to scale at all past 28 nanometers. Its cost declines, indicated by the red line, cease at this point. To address these issues, manufacturers are migrating to new memory technologies. Emerging memories are needed to allow further cost reductions.
17:20 TC: These emerging memory technologies are represented by the black line on the chart to the left. Today they're more costly than NOR flash, but they will eventually cost less than flash to produce since they can scale past 28 nanometers. The chart on the right plots the cell area for SRAM cells that were presented at IEEE conferences over the last few years. Although the cell size has been shrinking to keep pace with the process geometry, which runs along the horizontal axis, it suddenly stopped shrinking as the process transitioned from 22 to 14 nanometers.
17:54 TC: In addition, SRAM is much larger than the persistent memory candidates. And let's meet these persistent memory candidates. They all share the same attributes: they all have a small single-element bit cell that promises to scale smaller than current technologies to support small, inexpensive die and 3D stacking. They also promise to be easier to use than flash memory by supporting write in place with no need for a block erase, and they have more symmetrical read-write speeds. Finally, they're all non-volatile or persistent. Data doesn't disappear when power is lost, so they can all be used as persistent memory. These new memories are necessary to continue the scaling that we've come to expect from semiconductor devices.
18:38 TC: The technologies that are shown here are MRAM, phase change memory, resistive RAM, and ferroelectric RAM. As for some examples of products that are out there, there's a company called Everspin which has shipped over 120 million standalone MRAM chips. This company has partnered with Global Foundries, a major semiconductor foundry company that is building 300 millimeter wafers to support Everspin but also targeting these wafers for embedded memory applications. All the major foundry companies -- TSMC, Samsung and others -- are starting to ship spin-transfer torque MRAM products.
19:17 TC: And there have already been products being shown, for instance, IBM at the 2018 MRAM Developers conference just before the Flash Memory Summit. They were showing an Everspin MRAM write cache for an SSD. Phase change memory, the best example of that today is Intel's Optane NVMe, which started shipping SSD products in 2017, and they started Optane DIMM-based products for the memory channel in June 2018. Started shipping those in 2019 and are shipping today. Both products are currently used in storage systems.
19:54 TC: The acceptance of an emerging memory technology depends upon its pricing. To capture a mainstream market, a memory must be cost competitive against entrenched technologies. This cost structure can only be achieved through high volume shipments. Products that don't reach this volume must play in niches because of their higher price. A good example of this is the Micron Intel 3D XPoint memory. This chart represents the computer's memory-storage hierarchy. For a memory or storage technology to make sense in this hierarchy, it must be cheaper than the next fastest technology, where the price is shown on the horizontal axis, and faster than the next cheapest technology, where the performance is shown on the vertical axis.
20:35 TC: Intel is trying to get 3D XPoint adopted into this memory hierarchy. 3D XPoint is already faster than flash, but slower than DRAM. But to fit into the hierarchy as shown on the chart, it must be priced below DRAM, otherwise, Intel's customers would just buy DRAM. Although Intel probably lost well over $1 billion per year on Optane 3D XPoint while they ramped it up over the past three years, as of the last quarter, it reported no losses probably because it started to reach production volume that lowered the costs below the price it must sell for. Intel had to overcome a chicken and the egg problem, 3D XPoint must be cheap to sell in volume, but 3D XPoint must sell in volume to get cheap.
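The hierarchy rule stated above (cheaper than the next fastest technology, faster than the next cheapest) can be written as a small check. This is only a sketch: the `Tier` type, the `fits_hierarchy` function, and every price and speed value are hypothetical, in arbitrary relative units, and are not taken from the chart.

```python
from typing import NamedTuple, Sequence


class Tier(NamedTuple):
    name: str
    price_per_gb: float  # relative units (hypothetical)
    speed: float         # relative units, higher is faster (hypothetical)


def fits_hierarchy(candidate: Tier, tiers: Sequence[Tier]) -> bool:
    """Check the rule: cheaper than the next-faster tier, faster than the next-cheaper tier."""
    faster = [t for t in tiers if t.speed > candidate.speed]
    cheaper = [t for t in tiers if t.price_per_gb < candidate.price_per_gb]
    if faster:
        next_faster = min(faster, key=lambda t: t.speed)
        if candidate.price_per_gb >= next_faster.price_per_gb:
            return False  # a faster technology costs no more, so buyers pick it
    if cheaper:
        next_cheaper = max(cheaper, key=lambda t: t.price_per_gb)
        if candidate.speed <= next_cheaper.speed:
            return False  # a cheaper technology is at least as fast
    return True


# Made-up relative numbers, for illustration only.
existing = [Tier("DRAM", 10.0, 100.0), Tier("NAND flash", 1.0, 1.0)]
print(fits_hierarchy(Tier("priced below DRAM", 5.0, 10.0), existing))
print(fits_hierarchy(Tier("priced above DRAM", 12.0, 10.0), existing))
```

With these made-up numbers, a candidate that is slower than DRAM fits only when it is also priced below DRAM, which is the constraint described for 3D XPoint.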
21:19 TC: The same problem stands in the way of other emerging memories. NOR's hard stop at 28 nanometers and the size of SRAM open the door for other technologies to take over as the embedded non-volatile memory in systems on chips. Today, MRAM is the leading candidate in this race.
I'm going to talk to you a bit more about MRAM. MRAM saves power while reducing a chip's size and cost, which is what we're showing here in comparison with SRAM. Embedded MRAM will shrink smaller than SRAM as well as NOR flash. A 1T or 2T MRAM bit cell is smaller than the standard 6T SRAM bit cell and NOR flash. And NOR flash hits scaling limits after 28 nanometers, as we said. MRAM could replace embedded SRAM and NOR since it offers lower power, lower cost and higher density, as shown here.
22:11 TC: The type of MRAM that is used for a particular application can be tuned, optimizing for its data retention, endurance and capacity. Microcontrollers, ASICs and other SoCs will lead in the use of persistent memory in embedded devices. Their need for a NOR replacement has already driven major foundries, TSMC, Samsung, Global Foundries and others, to develop emerging memory technologies, mainly MRAM and resistive RAM, to replace a stalled NOR flash technology. In addition, MRAM is much smaller than SRAM, allowing more memory using MRAM than SRAM on a given size die.
22:49 TC: One of TSMC's customers, Ambiq, is developing chips that will enable the next generation of battery-powered, always on, voice recognition IoT endpoint devices. The company's fourth generation Apollo SoC family sets new standards for low power intelligent endpoint IoT devices. This chip is implemented with TSMC's 22 nanometer process. The SoC chip achieves power use as low as three microamps per megahertz by using MRAM memory with low deep sleep current loads and operates up to 192 megahertz clock frequency.
23:26 TC: The figure here shows the block diagram for this chip with various features, including two megabytes of MRAM along with up to one megabyte of SRAM. The Bluetooth or BLE radio blocks are in a Bluetooth version of this chip. These foundries will drive the cost out of persistent memory technologies, rendering them attractive as a replacement for SRAM, first slower SRAM and then the fast SRAM in upper level caches.
23:56 TC: When this happens, persistence will follow. Not only participating in the 3D XPoint and below technologies, but now moving into caches where caches themselves become persistent. So we could see a lot of reasons why there's going to be growth in these persistent memory technologies, and this chart comes from a recent report that Jim and I published, Jim Handy and I published called "Emerging Memories find their Direction." The total emerging memory market, MRAM plus 3D XPoint could exceed $36 billion by 2030.
24:33 TC: This chart shows how petabyte shipment growth could drive that number. It also compares the current and projected capacity shipments for the next 10 years for various emerging non-volatile memory technologies versus DRAM and NAND. Assumptions behind the chart are that embedded MRAM replaces most SoC NOR and SRAM and that MRAM has strong appeal for AI applications. And of course, 3D XPoint is being driven by server and storage system applications. Note that MRAM shipment growth in this chart includes both standalone as well as embedded MRAM technology.
25:10 TC: This chart, also from our latest report, shows the capital spending requirements to support the emerging memory growth that I just showed. MRAM and other emerging memory applications require both standard and specialized tooling, including test, ion etch, patterning and physical vapor deposition equipment. The chart forecasts capital spending up to 2030. Our baseline estimate is for emerging memory capital expenditures to add $700 million to the semiconductor equipment market annually by 2030. So, to summarize, lithographic scaling is limiting the use of NOR flash and SRAM for high density embedded devices. Persistent memories that scale beyond NOR flash and SRAM are now available, and these will change the storage memory hierarchy.
26:00 TC: MRAM will grow from embedded low power applications, for instance, running AI and machine learning local inference engines and persistent memories will eventually replace SRAM cache memory. MRAM and phase-change memory use will generate over $36 billion in annual revenue by 2030 and capital spending for MRAM will exceed $700 million annually by 2030. So much of the information in this presentation was drawn from a newly released report on emerging memories. The report that this data comes from is available to purchase online. It describes the entire emerging memory ecosystem, the technologies, phase change memory, resistive RAM, magnetic random access and ferroelectric random access memory, the companies, the markets and support requirements to enable these.
26:49 TC: Forecasts examine emerging memory consumption, both embedded emerging memories and discrete emerging memories. It's 201 pages long with 31 tables and 142 figures. And you can go to the URLs at the bottom of this slide if you wish to learn more. So, with that, I'd like to thank you, there's my website for your information, but thank you for listening to me today, and I'd be glad to answer any questions that you may have.