Big Memory Software
Take a look at software-defined memory, combining persistent memory and DRAM.
00:00 Charles Fan: Good afternoon, everyone. Thank you for joining this session. My name is Charles Fan and I'm co-founder and CEO of MemVerge. Today, I will introduce software-defined memory, combining persistent memory and DRAM, and go over the product and technology we have been developing over the last four years.
So, this is today's computer. Whether you're looking at a server in a data center, a PC on your desk or a phone in your hand, they all follow the von Neumann model: the application runs on the CPU, the computing unit; the data is placed in the main memory; and the CPU and the memory interact to support the application.
00:46 CF: When the data cannot fit in the memory, or when you need to persist data for the long term, external storage is involved, and the data is placed into the external storage for long-term retention. This movement of data from memory to external storage, or from storage to memory, is called I/O -- input and output of data in a computer system. And memory is fast. It typically has tens of nanoseconds of latency, which allows load/store operations where, if you're writing a program, the program actually waits for the data to be committed to memory before continuing. For example, when you write X=1, you put a 1 into a memory location, and then you go on to the next instruction.
01:36 CF: Storage, by contrast, is much slower -- typically three orders of magnitude slower -- and it happens asynchronously. You typically open a file handle, you write your chunk of data into storage, and you read in the same way. The latency is typically hundreds of microseconds. So, there is somewhere around a three-orders-of-magnitude, or one-thousand-times, difference in speed. But at the same time, storage has a certain advantage over memory in its capacity. We have storage systems today that scale easily to petabytes of data in a single system, whereas memory is typically hundreds of gigabytes.
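(To make the contrast concrete, here is a minimal C sketch of the two data paths Fan describes -- a synchronous memory store versus file I/O through a handle. The latencies in the comments are the rough orders of magnitude from the talk, not measurements.)

```c
/* A minimal sketch of the two data paths: a synchronous memory
 * store versus file I/O through a handle. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    /* Memory path: a load/store. The program waits for the store
     * to commit (tens of nanoseconds) before moving on. */
    int x = 1;            /* "X = 1": a 1 is placed in a memory location */

    /* Storage path: I/O through a file handle, typically hundreds
     * of microseconds per operation -- roughly a thousand times
     * slower than the store above. */
    FILE *f = fopen("data.bin", "wb");
    if (f == NULL) {
        perror("fopen");
        return EXIT_FAILURE;
    }
    fwrite(&x, sizeof x, 1, f);  /* write a chunk of data to storage */
    fclose(f);                   /* flush and release the file handle */
    return EXIT_SUCCESS;
}
```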
02:23 CF: So, there's a big difference in terms of capacity, and in terms of cost, where storage is much cheaper. And, very importantly, storage is persistent. Therefore, you can build various kinds of data services -- such as snapshots, replication and deduplication -- that can store data for the long term in an efficient manner. Memory, in order to deliver its speed, is typically small in capacity, expensive and volatile. So, it's really the combination of these two types of media that supports the computing industry today, and it has been such for the last 50 years. But in the last 10 years, the world has become increasingly data-centric. There are new sets of applications that are driving the new generations of application design and data center architecture.
03:20 CF: In particular, we are seeing the rise of big data analytics and, in more recent years, more and more real-time big data analytics, where insights and actions need to be taken within nanoseconds of the discovery of data. We are also witnessing the rise of artificial intelligence and machine learning, where more and more data is being processed and systems learn and become able to act on their own. This also requires an infrastructure that can process data at both higher capacity and higher speed. And many of the traditional applications -- such as trading applications in financial services, or seismic and genomic analytics in high-performance computing, or HPC -- will also continue to grow in both the capacity and the velocity of data.
04:15 CF: So, all this is driving a new type of infrastructure. On the bottom right of this slide, you'll see a prediction from IDC that clearly indicates the exponential growth of all data. More interestingly, the yellow line shows the amount of real-time data as a percentage of all data, and it is increasing. That indicates real-time data is growing even faster than data in general, and it's going to reach about 25% of all data by 2024.
And for the world's enterprises, more and more mission-critical applications need to use real-time data. That means the requirement is not only capacity and velocity but also the mission-criticality of the applications. This is putting stress on the infrastructure, because memory is fast enough but not big enough, while storage is big enough but not fast enough. So, what could be a possible solution that can really address this growing challenge?
05:35 CF: Now, the good news is that over the last 18 months, we have seen the introduction of a new type of media to the world of memory -- namely, the persistent memory that Intel pioneered. We are expecting other vendors to join this market over the next couple of years, so this will become a new market for the memory that Intel calls Optane persistent memory. The properties of this memory are that it is bigger than DRAM but smaller than SSD, and it is more expensive than SSD but cheaper than DRAM. So, it sits somewhere between DRAM and SSD, but it is much closer to DRAM than to SSD. It is about five times slower than DRAM, and it's more than 100 times faster than the fastest storage device. So, it can be used at memory speed.
06:44 CF: And even more importantly, this new type of memory is persistent, or non-volatile. That means the data placed there can survive power cycles and will stay there even after you turn the computer off and on again.
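(A minimal sketch of what byte-addressable persistence looks like to a program, assuming a file on a DAX-mounted persistent memory file system -- the path /mnt/pmem/example is an assumption for illustration. Real PMEM code would typically use a library such as PMDK's libpmem; plain mmap plus msync is shown here to keep the example self-contained.)

```c
/* Persistent memory accessed with ordinary loads and stores,
 * assuming a DAX-mounted file system at /mnt/pmem (an assumption). */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void) {
    const size_t len = 4096;
    int fd = open("/mnt/pmem/example", O_CREAT | O_RDWR, 0644);
    if (fd < 0) { perror("open"); return EXIT_FAILURE; }
    if (ftruncate(fd, len) != 0) { perror("ftruncate"); return EXIT_FAILURE; }

    /* Map the persistent memory directly into the address space. */
    char *pmem = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (pmem == MAP_FAILED) { perror("mmap"); return EXIT_FAILURE; }

    /* Ordinary stores, at memory speed -- no read/write syscalls. */
    strcpy(pmem, "survives a power cycle");

    /* Flush so the data is durable on the media. */
    msync(pmem, len, MS_SYNC);

    munmap(pmem, len);
    close(fd);
    return EXIT_SUCCESS;
}
```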
So, this opens up many new possibilities in terms of how applications can be run. What we are proposing at MemVerge, in collaboration with Intel and other manufacturers of this new type of memory, is a new architecture we call Big Memory Computing. The way Big Memory Computing works is that we envision a world where the entire application can run in memory, without needing to incur I/O and use an external storage system. So, the von Neumann architecture that we have been used to for the last 75 years, and the DRAM-only memory that we have been used to for the last 50 years, will all change in the next few years.
07:44 CF: And there will be this new concept of Big Memory, created from a combination of DRAM, PMEM and, eventually, additional memory media. They will be harnessed together by a layer of memory virtualization that we call Big Memory software. With the software combining these two types of memory and making their combined capacity and capabilities available to the applications, we can achieve the best of both worlds. We can bring the performance of the combined memory to the DRAM level while offering larger capacity, lower cost and persistence, which makes it possible to build storage-like data services directly on top of memory that can address all the data needs of the applications, without requiring the I/O needed to access a storage system.
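(As an illustration of the tiering idea behind such a virtualization layer -- one address space for the application, hot data placed in DRAM and colder data in PMEM -- here is a schematic C sketch. This is not MemVerge's implementation; the two arenas are ordinary malloc'd buffers standing in for the two media, and the hot/cold hint stands in for whatever placement policy the real software uses.)

```c
/* A schematic two-tier allocator: a small, fast tier and a large,
 * cheap tier presented to the application as one memory service. */
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    char *base, *next, *end;
} arena_t;

typedef struct {
    arena_t dram;   /* small, fast tier  */
    arena_t pmem;   /* large, cheap tier */
} big_memory_t;

static void arena_init(arena_t *a, size_t size) {
    a->base = a->next = malloc(size);
    if (!a->base) exit(EXIT_FAILURE);
    a->end = a->base + size;
}

/* Place an allocation by an access-frequency hint; spill to the
 * capacity tier when the fast tier is full. */
static void *bm_alloc(big_memory_t *bm, size_t size, int hot) {
    arena_t *a = hot ? &bm->dram : &bm->pmem;
    if (a->next + size > a->end)
        a = &bm->pmem;                 /* fall back to capacity tier */
    if (a->next + size > a->end)
        return NULL;
    void *p = a->next;
    a->next += size;
    return p;
}

int main(void) {
    big_memory_t bm;
    arena_init(&bm.dram, 16 << 10);    /* pretend 16 KB of DRAM  */
    arena_init(&bm.pmem, 256 << 10);   /* pretend 256 KB of PMEM */

    int *hot_counter = bm_alloc(&bm, sizeof(int), 1);
    char *cold_log   = bm_alloc(&bm, 64 << 10, 0);
    printf("hot in fast tier: %p, cold in capacity tier: %p\n",
           (void *)hot_counter, (void *)cold_log);
    free(bm.dram.base);
    free(bm.pmem.base);
    return 0;
}
```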
08:47 CF: So, this is the concept that we are proposing, and we believe it will be a catalyst driving fast growth of the persistent memory market. This is another IDC prediction, on the right-hand side, where over the next three or four years they are predicting pretty dramatic growth of this new type of media -- 248% year over year, to $2.6 billion of revenue on the hardware alone, on the byte-addressable media alone. And this doesn't include the application of storage-class memory to storage devices; this is just storage-class memory as memory devices. There is another prediction, from Forbes, showing that the amount of persistent memory will exceed the amount of DRAM by the year 2029, and that the revenue of persistent memory will reach $25 billion within the next 10 years.
09:51 CF: So, these are different analysts predicting different curves, but invariably, they are growing very fast. And it is this layer of software, we believe, that can really help the adoption of this technology while maintaining compatibility with existing applications. At MemVerge, over the last three and a half years, we developed the first such Big Memory software, which we call Memory Machine. Memory Machine works exactly as I described on the last slide. We virtualize DRAM and Optane memory, and we create these little memory machines -- software-defined memory services available to each of the application processes -- and each of these memory machines consists of DRAM and PMEM. The ratio between the two types of memory is dynamically reconfigurable.
10:51 CF: So, by finding the optimal combination of DRAM and PMEM, we can offer the software memory service at the same speed as DRAM to each of these applications, while delivering larger capacity at lower cost. So, the benefit of this memory service . . . The first benefit is that it offers a lower-cost memory service at a larger scale without sacrificing performance. This becomes a strong TCO play, where the customer can reduce the total cost of ownership without changing their applications. And this is really the first value proposition the Memory Machine product delivers to customers. In addition to the software-defined memory service, we also developed advanced data services within Memory Machine. The first of these data services is the zero-I/O in-memory snapshot.
11:49 CF: What this snapshot does is track all the application state in memory -- both in the application itself and all the necessary memory state in the operating system. When the operator of the application decides to take a snapshot, or when the scheduled time for a snapshot comes, it captures all that state and commits it onto the persistent memory, without moving it to persistent media on a storage system.
So, that allows this to happen at very high speed, with minimal disruption to the applications. After the first such snapshot is taken -- and that usually takes less than a second to complete -- the application can continue to run, and these snapshots can be taken again. When a snapshot is taken again, it does not overwrite the previous snapshot; instead, it commits just the memory pages that have changed since the last snapshot, which we also remember.
12:55 CF: It can do this repeatedly, up to 128 times, so you can keep 128 independent snapshots of the past without replicating memory that many times, because we are only keeping track of the memory pages that have changed.
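(Here is a toy C sketch of that changed-pages idea: each snapshot records only the pages written since the previous one, so 128 snapshots don't require 128 full copies of memory. A real implementation would track dirty pages through the MMU, for example by write-protecting pages; the explicit dirty flags below are purely illustrative, and none of this is MemVerge's code.)

```c
/* Toy incremental snapshots: each one saves only the pages
 * written since the previous snapshot. */
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096
#define NUM_PAGES 1024
#define MAX_SNAPS 128           /* matches the limit from the talk */

typedef struct {                /* one preserved page version */
    size_t page_no;
    char data[PAGE_SIZE];
} page_copy_t;

typedef struct {
    page_copy_t *pages;         /* only the pages that changed */
    size_t count;
} snapshot_t;

static char memory[NUM_PAGES][PAGE_SIZE]; /* the application's memory */
static int dirty[NUM_PAGES];              /* written since last snapshot? */
static snapshot_t snaps[MAX_SNAPS];
static size_t nsnaps;

/* Application writes go through here so we can mark pages dirty. */
void app_write(size_t page_no, size_t off, const void *src, size_t len) {
    memcpy(&memory[page_no][off], src, len);
    dirty[page_no] = 1;
}

/* Take a snapshot: preserve only the dirty pages, then reset flags. */
int take_snapshot(void) {
    if (nsnaps == MAX_SNAPS) return -1;
    snapshot_t *s = &snaps[nsnaps];
    s->count = 0;
    s->pages = malloc(NUM_PAGES * sizeof(page_copy_t));
    if (!s->pages) return -1;
    for (size_t i = 0; i < NUM_PAGES; i++) {
        if (!dirty[i]) continue;
        s->pages[s->count].page_no = i;
        memcpy(s->pages[s->count].data, memory[i], PAGE_SIZE);
        s->count++;
        dirty[i] = 0;
    }
    nsnaps++;
    return 0;
}

int main(void) {
    app_write(3, 0, "hello", 6);
    take_snapshot();            /* snapshot 1 saves only page 3 */
    app_write(7, 0, "world", 6);
    take_snapshot();            /* snapshot 2 saves only page 7 */
    return 0;
}
```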
This really enables a number of very powerful features in Memory Machine. First, you can roll back to any of the previous snapshots. This essentially allows the application to time travel: if you made a human error, you can move to a snapshot taken before you made that error. It's very similar to, say, the auto-save feature when I'm editing a PowerPoint -- no matter what mistake I've made, I can always go back to the version from before I made it.
13:40 CF: This can also protect against power failures, similar to auto-save as well. If you lose your application because you lose power, then when power comes back, the application can automatically recover to the last known-good snapshot of itself. And it enables migrations. Once a snapshot is taken, it becomes a unit that can be transported from one server to another, from one virtual machine to another, or from one data center to another -- from on-premises to the cloud, for example.
So, this provides another way to move an application dynamically and recover it right back to the point in time when the snapshot was taken. And last but not least, it enables a really powerful and cool feature we call cloning.
14:34 CF: A new application instance can be created from a previous snapshot, independent of the currently running process. So, you can have multiple independent processes of the same application, and they can share some of the physical memory without impacting each other's operation. With this data service available on memory for the first time, many interesting capabilities that were not possible before now become possible.
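(The talk doesn't spell out the cloning mechanism, but the underlying principle -- independent processes sharing physical pages until one of them writes -- is the same copy-on-write idea that operating systems expose through fork(). A minimal illustration:)

```c
/* Two independent instances sharing physical memory copy-on-write,
 * illustrated with the standard fork() system call. This is only
 * an analogy for the cloning principle, not MemVerge's code. */
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void) {
    size_t n = 64 << 20;                 /* 64 MB of "application state" */
    char *state = malloc(n);
    if (!state) return EXIT_FAILURE;
    state[0] = 'A';

    pid_t pid = fork();                  /* clone: pages shared CoW */
    if (pid == 0) {
        /* Child: an independent instance. Writing copies only the
         * touched page; the other pages stay physically shared. */
        state[0] = 'B';
        printf("clone sees: %c\n", state[0]);       /* B */
        _exit(0);
    }
    waitpid(pid, NULL, 0);
    printf("original still sees: %c\n", state[0]);  /* A: unaffected */
    free(state);
    return 0;
}
```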
And most interestingly, this doesn't require any application change. The existing application runs as is, and all of this snapshotting, recovery, migration and cloning can be managed from our graphical user interface or from our command line. So, this is fully compatible with all the applications that were running on the system before.
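(How can a memory layer take over an unmodified application at all? The talk doesn't say which mechanism Memory Machine uses, but one common way user-space software transparently intercepts a process's allocations on Linux is LD_PRELOAD interposition, sketched here purely to show why "no application change" is technically feasible.)

```c
/* tier_shim.c -- a schematic LD_PRELOAD malloc interposer.
 * Build:  gcc -shared -fPIC -o tier_shim.so tier_shim.c -ldl
 * Run:    LD_PRELOAD=./tier_shim.so ./unmodified_app
 * This is NOT MemVerge's mechanism or code; it only shows that
 * allocations can be redirected without changing the application. */
#define _GNU_SOURCE
#include <dlfcn.h>
#include <stddef.h>

void *malloc(size_t size) {
    /* Look up the real allocator the first time through. */
    static void *(*real_malloc)(size_t);
    if (!real_malloc)
        real_malloc = (void *(*)(size_t))dlsym(RTLD_NEXT, "malloc");
    /* A tiering layer would choose DRAM vs. PMEM placement here;
     * this sketch just forwards to the original allocator. */
    return real_malloc(size);
}
```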
15:34 CF: So, that is an introduction to the product, which we are very proud of. We shipped it on September 23rd of this year, so the product is less than two months old, and we are getting our first set of customers using it. Over time, we believe Big Memory Computing will sweep across all data centers and all applications around the world, in the cloud and on-premises.
But as a new startup, we have to focus on a number of industries to really make customers successful, one by one. The first three industries where we found early traction, and that we are focusing on, start with cloud service providers, where a Big Memory platform delivers a larger amount of memory at a lower cost without sacrificing performance, which enables higher VM density per server.
16:39 CF: So, they can have more VMs per server, therefore reducing the per-VM cost, and that can increase the margins of these cloud service providers or lower the cost for their customers. That's the first use case. We also found a lot of traction with financial services, as well as with animation studios -- computer graphics folks who have really big movies being edited on workstations and rendered on servers. By offering bigger memory, as well as the crash-recovery capabilities of our Memory Machine, we can help both the animators and the financial knowledge workers who work with in-memory databases of trading information to not lose time when the system crashes or when they make a mistake.
17:34 CF: Recovery that took hours before us can now be reduced to seconds. If you have a physical failure, the system can be back to work within seconds, which increases the availability and the business continuity of the system, and really saves the dollars you would have lost to the downtime we avoid.
So, these are the early use cases and the early adopters in the first three industries. We are also starting to look at additional industries, including some of the HPC workloads for seismic and genomic computing, as well as AI and ML workloads. So, in the next few minutes, let me show you some of the real numbers we have been getting from customers in these use cases: in-memory databases, cloud service providers offering VM services and something newer -- AI and machine learning use cases -- and how they run on top of our Memory Machine.
18:42 CF: The first example is a customer of ours, a cloud service provider, running Memory Machine in their cloud infrastructure -- in this case, a KVM hypervisor running on Memory Machine. They are running MySQL inside those virtual machines; MySQL is among the most popular databases by deployments. And we're running a standard benchmark, Sysbench, on this database. The red bar is the baseline, running on DRAM without us or Optane involved. You can see we got 47,000 queries per second, and that's the number we try to match.
Now, all the blue bars are with our software and with different ratios between PMEM and DRAM. In the first blue bar, you see 37,000 -- we are slower than DRAM. This is not surprising, because in this configuration we are only using persistent memory without DRAM, and persistent memory by itself is slower.
19:48 CF: No matter how good a job we do in optimization, we just couldn't make it faster -- so, that's 37,000. Now, as we start adding DRAM to the mix, with various ratios between PMEM and DRAM, putting them together as a software-defined memory service available to MySQL, we get better and better. And when we have 16 gigabytes of DRAM added to the persistent memory, we get higher performance than DRAM, which is a pleasant surprise.
We were hoping to get performance very close to DRAM, but in this case, due to the software optimization we have done in our Memory Machine layer, we actually exceeded the bare-metal DRAM speed. That's because the bare-metal DRAM in the baseline case is managed by the operating system, whereas we bypass the operating system and manage the memory ourselves, so we can optimize more aggressively for these data-centric applications.
20:42 CF: So, here you get the same performance -- or, in this case, even a little better -- and you get lower cost. With this ratio, you can get roughly 30-some percent cost savings on your memory, and that's a big deal for these cloud service providers. So, that's the first use case, really demonstrating that we provide bigger, cheaper memory without losing performance.
The second use case, as I mentioned, uses the snapshot capability for our financial customers, for whom in-memory databases are a very popular application. We are actually working with a number of customers on this, and this is a typical use case from one of them, built on Redis, one of the most popular in-memory key-value stores in the world. Redis does have its own restore capability from a storage system: it can save all the data in memory onto storage, and it can restore from storage whenever needed, either after a crash or for any other reason.
21:48 CF: If you have 300 million keys -- 315 gigabytes of data in memory -- it takes a number of minutes to save them onto storage, and a number of minutes to restore them from storage back into memory and get running again. In this case, it takes about 15 minutes for the recovery -- the restoration from storage -- to move this 315 gigabytes back into memory and get it running again.
So, with our snapshot, you no longer need to do this; you no longer need to move the data to storage. You can just use our snapshot, and recovery takes about half a second. It's a dramatic 1,500-times improvement in the speed at which you can restore an in-memory database, and that's obviously of value to a number of these customers where speed is important. So, that's the second use case where we demonstrate our value.
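(A back-of-the-envelope check of those numbers: moving 315 GB in roughly 15 minutes implies a sustained storage throughput of about 350 MB/s, which is the data movement the in-memory snapshot restore avoids entirely. The little program below just derives that figure from the talk's numbers; it is not a measurement.)

```c
/* Derive the implied storage throughput from the figures in the
 * talk: 315 GB restored in ~15 minutes. Not a measurement. */
#include <stdio.h>

int main(void) {
    double data_gb = 315.0;             /* in-memory dataset size */
    double restore_s = 15.0 * 60.0;     /* ~15-minute restore     */
    double mb_per_s = data_gb * 1024.0 / restore_s;
    printf("implied storage throughput: ~%.0f MB/s\n", mb_per_s);
    return 0;
}
```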
22:44 CF: The third use case is a more next-gen application -- in this case, a facial recognition AI inference program. Here, we are offering a bigger memory in which not only the model that has been trained to do the facial recognition, but also the embeddings and the feature libraries, can all fit in memory. Before us, this data tended to be very big; it didn't fit into memory, and some of it needed to be placed on disk. With our bigger memory solution, we avoid data being moved to the disks, to the storage system. If some of the data goes to the storage system, the performance of the system drops severely; with our large memory system, going to storage is no longer necessary.
23:40 CF: So, the throughput -- transactions per second -- as well as latency both improved significantly. Transactions per second increased about four times, and latency improved about 100 times, in terms of how fast you can recognize a face using this system.
So, these are three examples of how customers can use our Memory Machine, combined with heterogeneous memory, to get the capacity and cost benefits while taking advantage of the underlying persistence to provide higher availability for their applications. And long term, as I mentioned, I think this is going to be a very significant change to the data center.
On the left-hand side is the data center today. You have compute -- typically CPUs and GPUs. You have memory -- typically DRAM today. You have performance storage; these are the all-flash systems that leading storage vendors supply.
24:46 CF: And you have capacity storage -- secondary storage that a number of storage vendors supply, and that all of the cloud services supply as well. They're all interconnected through various kinds of networking. And that, in itself, is about half a trillion dollars of infrastructure.
And there's a whole value chain, from the applications running on top, to the system-level software, through the hardware systems, down to the hardware media and chips. It's a vibrant and changing industry. And one direction we think it's going to change -- driven by the real-time data needs of applications and by the innovations in hardware and software -- is toward the increased memory-centricity of our data centers.
25:49 CF: We believe the memory layer will become a bigger percentage of this overall data center spend. And memory . . . Overall, DRAM is about $100 billion today. We believe it's going to grow in size, now that there are more types of memory inside. And we believe that, after this happens, future applications will live in memory. Imagine if you have tens or even hundreds of terabytes of memory available to you per server -- there are much more interesting things you could do, especially since this main memory can be made highly available, too.
And we believe this is going to cut into the market for performance storage, where storage will be relegated to just handling capacity. On the capacity side, there'll be more NAND flash SSDs as opposed to the hard drives of today, but the SSD-driven performance storage, we believe, will be replaced by a Big Memory service powered by the new types of storage-class memory.
26:53 CF: So, that's our vision of the future, and we believe new markets of tens of billions of dollars will be created, both in hardware and in software. This is quite relevant to this conference and to this audience today. We are the first to introduce this Big Memory software layer, but we don't want to be the only one. So, we are working hard to collaborate with all the hardware suppliers, and we welcome other software vendors to join the space to really create this new market and to facilitate the Big Memory movement that we believe will happen.
That's it from us at MemVerge. We hope you share our excitement, we hope you enjoy the rest of this track, and we'll talk to you in more detail in the panel, where we'll be joined by some of our partners and customers to discuss the excitement of Big Memory Computing further. Thank you very much.