How to determine if an in-memory DBMS is right for your company
Database expert Craig S. Mullins examines the pros and cons of the in-memory database management system, and the criteria you should consider during the request-for-proposal and evaluation period.
In-memory database technology represents a burgeoning trend in the database management system (DBMS) marketplace. Although the concept of processing data in computer memory isn't new, the approaches and technologies adopted by both recent and existing DBMSes are.
An in-memory database management system (in-memory DBMS or IMDBMS) -- also known as a main memory database system or memory resident DBMS -- predominantly relies on main memory for data storage, management and manipulation.
The traditional storage mechanism for a DBMS is disk storage. Most DBMSes move data from disk to memory in a cache (or buffer pool) when the data is accessed. Moving the data to memory makes subsequent access to data more efficient. But the constant movement from disk to memory and back can cause performance issues.
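To make the caching behavior concrete, here is a minimal sketch of a buffer pool in Python. It assumes a least-recently-used (LRU) eviction policy and uses a plain dictionary as a stand-in for disk; the class name BufferPool and its methods are hypothetical, and real DBMS buffer managers are considerably more sophisticated.

```python
from collections import OrderedDict


class BufferPool:
    """Toy LRU buffer pool that caches pages read from a simulated disk."""

    def __init__(self, capacity, disk):
        self.capacity = capacity    # maximum number of pages held in memory
        self.disk = disk            # dict standing in for disk storage (assumption)
        self.pages = OrderedDict()  # page_id -> data, least recently used first
        self.disk_reads = 0         # counts how often we had to go to "disk"

    def get(self, page_id):
        if page_id in self.pages:
            # Cache hit: mark the page as most recently used.
            self.pages.move_to_end(page_id)
            return self.pages[page_id]
        # Cache miss: read from disk and keep the page in memory.
        self.disk_reads += 1
        data = self.disk[page_id]
        self.pages[page_id] = data
        if len(self.pages) > self.capacity:
            # Pool is full: evict the least recently used page.
            self.pages.popitem(last=False)
        return data


disk = {i: f"page-{i}" for i in range(10)}
pool = BufferPool(capacity=3, disk=disk)
for page_id in [0, 1, 0, 2, 3, 0, 1]:
    pool.get(page_id)
print(pool.disk_reads)  # 5
```

With room for only three pages, the seven requests above result in five simulated disk reads; the other two are served from memory. A pool that is too small for the working set keeps evicting and rereading pages, which is the disk-to-memory movement described above.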
So, the primary use case for an in-memory database management system is to improve the performance of the queries and applications that access the data. An in-memory DBMS can also execute fewer instructions per request, because accessing data in memory requires fewer activities than accessing it from disk.
NewSQL database systems are another offshoot of the current in-memory and NoSQL trends. The concept of NewSQL is to adopt the market momentum of NoSQL with modern database architectures, configurations and implementations, but to support SQL -- thereby leveraging the knowledge of the huge pool of SQL developers. Not all NewSQL DBMS products are in-memory, but many are.
Market factors for in-memory databases
Since the notion of speeding up processes by using memory isn't new, what factors are causing the in-memory DBMS trend?
First and foremost, the supporting technology is becoming more widely adopted and less expensive. The amount of memory available on today's servers continues to expand, with many servers having 32 terabytes (TB) of memory or more. Furthermore, the cost of memory continues to decline, with 8 GB of memory available for $50 or less today, versus hundreds of dollars just a few years ago. While memory still isn't as cheap as disk, its price continues to decline and the price/performance ratio makes sense for some uses, given the performance gain that in-memory DBMSes can achieve over traditional, disk-based DBMSes.
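As a rough illustration of the arithmetic, the sketch below turns the $50-per-8-GB figure into a per-gigabyte price and scales it to a 1 TB database. The function name memory_cost_usd is hypothetical, and the result is only a back-of-the-envelope number: it assumes the price quoted above applies uniformly, and it ignores server-class memory pricing, compression and DBMS overhead.

```python
def memory_cost_usd(database_gb, price_usd=50.0, module_gb=8.0):
    """Estimate raw memory cost from a per-module price (defaults from the text)."""
    price_per_gb = price_usd / module_gb  # $50 / 8 GB = $6.25 per GB
    return database_gb * price_per_gb


# Assumption: 1 TB is treated as 1,024 GB.
one_tb_cost = memory_cost_usd(1024)
print(one_tb_cost)  # 6400.0
```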
But hardware alone doesn't drive a trend. The need for speed in modern applications is contributing to the rise of the IMDBMS. Organizations are building and adopting more real-time and Web-facing applications that can benefit from the high-end performance that in-memory technology can deliver.
And with today's in-memory processing technology, persistence is no longer a barrier. By its very nature, memory is a volatile form of storage. If the server loses power, the data in memory will be lost. Modern in-memory DBMS offerings have been engineered for the data in memory to persist even after an outage. With stronger reliability and data persistence, in-memory DBMS products have become viable for the transactional and analytical processing requirements of most organizations.
Types of in-memory database systems
At first glance, it might seem easy to define in-memory DBMS -- but in-memory data processing has a long history, with varied approaches.
One of the earliest forms of in-memory data processing was performed by COBOL programmers who created in-memory tables to store data that could be accessed multiple times as a program ran. This, of course, wasn't a database system, but it was an early form of in-memory data access adopted to increase processing speed.
As IT moved into the era of the DBMS, techniques arose for accessing data from memory instead of disk. Any DBA who has used any type of DBMS most likely has tried to get data to be accessed in memory instead of from disk. At the basic level, DBAs must set up appropriate levels of memory to cache data in buffer pools. Caching allows data to remain in memory for subsequent accesses.
Another form of in-memory data processing is to use solid-state disk (SSD). An SSD is a data storage device that relies on memory chips instead of spinning disk to store data persistently. The history of SSDs can be traced back to the 1950s on large mainframes and supercomputers; in the 1980s, some IBM Information Management System (IMS) databases were stored on a type of early SSD. But until recently, the technology was cost-prohibitive for widespread adoption. Today, with memory cheaper than ever, an easy way to approximate an in-memory database is simply to store database files on SSDs.
But a modern in-memory DBMS is much more than a standard DBMS stored on an SSD. The modern in-memory DBMS is engineered specifically for in-memory processing. All data is stored in memory (dynamic random access memory, or DRAM) on a server, and all operations are performed in memory, so the IMDBMS isn't simply an in-memory cache in front of disk. The data may be stored in a compressed format to enable more efficient storage and access.
Another popular form of in-memory DBMS enables hybrid in-memory/disk-based databases. A hybrid relies on not only memory chips to store the data, but also hard disk drives. The advantage of a hybrid IMDBMS is flexibility, where databases can be designed with a balance of performance, cost and persistence. Many applications can benefit from some data being rapidly accessible in memory, with other, less-frequently accessed data stored on disk. Disk is still cheaper than memory, so the tradeoffs possible with a hybrid solution appeal to many organizations with mixed requirements and tighter budgets. Most of the leading relational database management system (RDBMS) vendors are adding in-memory database capabilities to complement existing disk-based storage.
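The hybrid idea can be sketched with SQLite through Python's standard sqlite3 module. SQLite is used here only as a convenient stand-in and isn't one of the products discussed in this article; the table names and the split between "hot" and archived data are assumptions made for illustration. One connection holds a disk-based database for less-frequently accessed data and attaches a memory-resident database for data that needs rapid access.

```python
import os
import sqlite3
import tempfile

with tempfile.TemporaryDirectory() as d:
    # The "main" database lives in a file on disk.
    conn = sqlite3.connect(os.path.join(d, "archive.db"))
    # Attach a second, memory-resident database under the name "hot".
    conn.execute("ATTACH DATABASE ':memory:' AS hot")

    # Hypothetical tables: history on disk, open orders in memory.
    conn.execute("CREATE TABLE main.order_history (id INTEGER PRIMARY KEY, total REAL)")
    conn.execute("CREATE TABLE hot.open_orders (id INTEGER PRIMARY KEY, total REAL)")
    conn.execute("INSERT INTO main.order_history VALUES (1, 25.0)")
    conn.execute("INSERT INTO hot.open_orders VALUES (2, 40.0)")
    conn.commit()

    # A single query can read from both the memory and disk tiers.
    combined = conn.execute(
        "SELECT id, total FROM hot.open_orders "
        "UNION ALL SELECT id, total FROM main.order_history ORDER BY id"
    ).fetchall()
    conn.close()

print(combined)  # [(1, 25.0), (2, 40.0)]
```

In this sketch the contents of the attached in-memory database disappear when the connection closes, while the file-based tables would survive if the file were kept. That is the performance, cost and persistence tradeoff a hybrid design lets you make table by table.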
IMDBMS products can be relational, NoSQL, NewSQL or any other type of DBMS. They can be used for operational transaction processing or for analytical, business intelligence applications. Of course, each particular product will have different features and capabilities that may enable it to support operational processing better than analytical processing (or vice versa).
Pros and cons of in-memory databases
The obvious strength of in-memory DBMSes is the significant performance gains that can be achieved when compared with disk-based alternatives. It isn't unreasonable to expect performance gains of three to four times with an in-memory DBMS, and sometimes much more.
In-memory DBMS options have been traditionally strong in the embedded database market, where a small footprint and noninvasive architecture are desirable. But today's in-memory DBMS marketplace, with many strong enterprise-quality offerings, supports far more than just embedded applications.
If the performance is so much better, why hasn't everybody migrated to an in-memory DBMS? Part of the reason is cost. As mentioned above, memory still costs more than disk, although DRAM chips are getting cheaper every year.
Additional barriers to acceptance include lack of in-memory DBMS expertise, legacy DBMS implementations (some of which are being augmented with in-memory capabilities), and nonstandard options -- sometimes to get the highest speed, you may need to use a different interface than pure SQL.
Finally, database size has traditionally been a constraining factor, but technology advancements are removing this constraint. Even so, although modern IMDBMSes can handle very large databases, many still believe that in-memory databases must be limited in size. More needs to be done to educate the market in order to eliminate this belief. There are examples of IMDBMSes scaling to more than a terabyte.
In-memory database use cases
The use cases for IMDBMSes are wide and varied. Any application that can benefit from a performance boost can potentially profit from using an IMDBMS.
In terms of specific in-memory DBMS uses, applications with real-time data management requirements can benefit, such as apps for telecom and networking, capital markets, defense and intelligence, travel and reservations, call center applications, and gaming.
Applications with an immediate need for data are also candidates, such as apps for real-time business intelligence, fraud detection, real-time analytics and streaming data.
Additional factors for your purchasing assessment
When considering an in-memory DBMS, there are additional considerations that should be factored into your purchasing assessment. Although most IMDBMS products have options to manage the data persistence question, you must pay careful attention to how data durability is handled. Because the data is in memory, which isn't persistent, an IMDBMS must provide a means by which data is hardened to a persistent store. What happens if you pull the plug on the server?
There are various ways for an in-memory DBMS to tackle durability. One option is transaction logging, in which changes are recorded in a log and periodic snapshots of the in-memory database are written to nonvolatile media. If the system fails and must be restarted, the database can be rolled back or forward to the last completed transaction. Another option is to maintain additional copies of the database, with what's essentially a standby database on nonvolatile media. Yet another option is to utilize nonvolatile RAM (NVRAM), such as RAM that's backed up by a battery, or ferroelectric RAM (FRAM), which can maintain data when the power is turned off. And, of course, hybrid in-memory DBMSes can use disk-based storage for durability.
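The logging-and-snapshot option can be illustrated with a small Python sketch. It is a toy key-value store, not the mechanism of any particular product: the class name DurableStore, the file names and the JSON format are all assumptions made for illustration. Each change is appended to a log on disk before it is applied in memory, a snapshot periodically captures the whole in-memory state, and recovery loads the latest snapshot and rolls forward through the log.

```python
import json
import os
import tempfile


class DurableStore:
    """Toy in-memory key-value store hardened by a snapshot plus a change log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "txn.log")
        self.data = {}
        self._recover()
        self.log = open(self.log_path, "a", encoding="utf-8")

    def _recover(self):
        # Start from the last snapshot, if one exists.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.data = json.load(f)
        # Roll forward by replaying changes logged since that snapshot.
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    key, value = json.loads(line)
                    self.data[key] = value

    def put(self, key, value):
        # Harden the change on disk before applying it in memory.
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def snapshot(self):
        # Write the full in-memory state to a temporary file, then swap it in.
        tmp_path = self.snapshot_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.snapshot_path)
        # The snapshot now covers every logged change, so start a fresh log.
        self.log.close()
        self.log = open(self.log_path, "w", encoding="utf-8")

    def close(self):
        self.log.close()


with tempfile.TemporaryDirectory() as d:
    store = DurableStore(d)
    store.put("a", 1)
    store.snapshot()
    store.put("b", 2)       # logged after the snapshot was taken
    store.close()           # the in-memory copy is now discarded

    recovered = DurableStore(d)  # simulates a restart after an outage
    recovered_data = dict(recovered.data)
    recovered.close()

print(recovered_data)  # {'a': 1, 'b': 2}
```

The restarted store sees both values: "a" comes from the snapshot and "b" from replaying the log. Forcing every change to disk with fsync is what makes it durable, and it is also where a durable in-memory design gives back some of its speed, so it is worth asking vendors how often and how synchronously their products harden data.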
You should also determine whether an in-memory DBMS is required, or whether another technology could be used. For example, if you deploy databases on SSD using your existing RDBMS, can you achieve sufficient performance gains with less disruption to your environment? The in-memory DBMS will outperform a traditional DBMS on SSD because it can eliminate overhead such as cache management -- but you should conduct tests to ensure that the additional gains warrant migrating to a new DBMS technology and vendor.
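A test of this kind can start very small. The sketch below is one possible shape for such a harness, again using SQLite from Python's standard library purely as a stand-in: it runs the same workload against a file-based database and an in-memory one and times both. The workload, the table and the function name time_workload are hypothetical; a real evaluation would use your own schema, queries, data volumes and candidate products, and the timings will vary from machine to machine.

```python
import os
import sqlite3
import tempfile
import time


def time_workload(conn, rows=200):
    """Insert rows one commit at a time, then aggregate; return (seconds, total)."""
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, amount INTEGER)")
    start = time.perf_counter()
    for i in range(rows):
        conn.execute("INSERT INTO t (amount) VALUES (?)", (i,))
        conn.commit()  # one transaction per row, as an OLTP-style workload might do
    total = conn.execute("SELECT SUM(amount) FROM t").fetchone()[0]
    return time.perf_counter() - start, total


# Same workload against a database file on whatever storage holds the temp directory.
with tempfile.TemporaryDirectory() as d:
    disk_conn = sqlite3.connect(os.path.join(d, "bench.db"))
    disk_seconds, disk_total = time_workload(disk_conn)
    disk_conn.close()

# And against a purely in-memory database.
mem_conn = sqlite3.connect(":memory:")
mem_seconds, mem_total = time_workload(mem_conn)
mem_conn.close()

print(f"disk:   {disk_seconds:.4f} s")
print(f"memory: {mem_seconds:.4f} s")
```

Both runs should produce the same query result; only the elapsed time differs. Comparing those times for a representative workload, on the storage you would actually deploy, gives you evidence for or against a migration rather than a vendor's headline number.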
Additionally, some database appliances are built using in-memory database systems and technology. A database appliance should be a turnkey solution requiring limited -- or no -- setup and installation. But ongoing administration is still required, so make sure you understand the DBMS technology embedded in any appliance you acquire.