Data is the lifeblood of organizations, and the database management system is the beating heart of most operational and analytical business systems. The DBMS is the primary platform for processing, storing and managing data and serving it to applications and end users. But there are many different types of DBMS products, each with its own strengths and weaknesses.
The most prevalent type is the relational database management system (RDBMS). It became the norm for data management more than 30 years ago, after low-cost servers became powerful enough to make the technology widely practical and relatively affordable. Relational databases use the SQL programming language and are based on a data model that supports transactional consistency and reliability, making them a good fit for the traditional structured data that's common in business applications.
But some shortcomings with the relational model -- in particular, its rigidity and cost -- became more apparent in the web era and were brought to the fore by the emergence of big data technologies. Today, IT departments trying to process unstructured and semistructured data or data sets with variable structures may want to consider NoSQL database technologies as an alternative to the RDBMS.
Applications that require high-speed transactions and rapid response rates, or that involve complex analytics done in real or near real time, can benefit from in-memory databases. NewSQL databases, now often referred to as distributed SQL ones, blend elements of the relational and NoSQL approaches to address the need for more scalability in SQL-based applications. It also may make sense to combine multiple database technologies for some processing needs -- and that can be done in a single platform if you use a multimodel DBMS. Other special-purpose database systems are also available.
DBMS categories and technologies
There's a lot at stake in the DBMS evaluation and selection process. Choosing a database technology can affect the success or failure of your IT systems and applications. Because the database landscape is complex and confusing to navigate, it's important to fully understand the different types of DBMSes, along with when and why to use them. That applies both to on-premises DBMS platforms and the cloud databases that organizations are increasingly using.
To help with that, let's look more closely at the available options.
Until relatively recently, the RDBMS was the only product category worth considering for most applications. The big data trend, as well as the growth of technologies like IoT and real-time data streaming, have led to the development of new kinds of DBMS products that compete well with relational software for certain use cases. However, the RDBMS continues to be the undisputed market leader in terms of revenue and installed base.
Based on the mathematics of set theory and first developed for commercial release in the late 1970s, relational databases provide data management, access and protection with reasonable performance for most applications, whether operational or analytical in nature.
Since the mid-1990s, relational software has been the primary operational DBMS, led by products from industry giants: Oracle Database, Microsoft SQL Server and IBM Db2, plus SAP HANA in recent years. As more database workloads move to the cloud, AWS and Google Cloud have also become factors in the RDBMS market, partly by offering cloud-based versions of MySQL, PostgreSQL and MariaDB -- three popular open source DBMSes.
Relational systems also became the most widely used DBMS platforms for the data warehouses that organizations began to deploy in the 1990s to support BI and enterprise reporting applications. In addition to Oracle, Microsoft and IBM, vendors like Teradata and SAP offered relational databases for data warehousing. In the cloud, they've been joined by RDBMS technologies such as Amazon Redshift from AWS, Google BigQuery and Snowflake, although those products use columnar storage to optimize analytical query performance.
The RDBMS is adaptable, stable and reliable. Its maturity has been bolstered by all the years of use in both large and small organizations. Support for the ACID properties -- atomicity, consistency, isolation and durability -- is one of the most compelling features of relational databases. ACID compliance guarantees that all transactions are completed correctly or that a database is returned to its previous state if a transaction fails to go through -- a key capability to ensure that data is always consistent.
Given the comprehensive nature of the RDBMS, why are other types of database systems gaining popularity? Web-scale data processing and big data requirements challenge the capabilities of relational databases. Although they can be used in these realms, alternatives that offer more flexible database schemas, less stringent consistency models and lower processing overhead can be advantageous in such dynamic environments. Also, the stability and reliability of RDBMS products comes at a cost: They aren't cheap.
2. NoSQL DBMS
NoSQL systems began to appear in the mid-2000s. At first meant to be taken literally, NoSQL came to more commonly stand for "not only SQL," as many NoSQL vendors adopted some aspects of the programming language. However, while the SQL-based RDBMS requires a rigidly defined schema, NoSQL databases give users more flexibility, supporting schemas in which all of the different entities don't need to contain the same data elements. For loosely defined or varied data structures that may also evolve over time, a NoSQL DBMS can be a more practical solution than an RDBMS.
Another difference between NoSQL and relational DBMSes is how data consistency is provided. Most NoSQL DBMS products support eventual consistency, in which data may not always be consistent across the nodes of a distributed database but becomes so when it's not being actively updated. RDBMS platforms typically also offer varying levels of locking, consistency and isolation that can be used to implement eventual consistency, and some NoSQL vendors have added full ACID compliance. In general, though, NoSQL systems offer a more relaxed form of consistency, which can speed up processing.
Because of those attributes, NoSQL addresses some of the problems that RDBMS technologies encounter in working with unstructured data, varied data sets and large amounts of sparse data. Data is classified as sparse when not every element in a database is populated and there is a lot of "empty space" between the values that are there. For example, think of a matrix that contains numerous zeroes and only a few actual values.
But while certain types of data and use cases can benefit from NoSQL software, using it can come at the price of eliminating transactional integrity, flexible indexing and ease of querying. Further complicating matters is the fact that NoSQL itself includes multiple types of DBMS platforms -- it's a broad descriptor for these four primary product categories:
- Key-value stores store pairs of unique keys and associated values. Examples include Aerospike, Amazon DynamoDB, Berkeley DB, Redis and Riak.
- Document databases store data in document-like structures encoded in formats such as JSON and XML. Examples include Couchbase Server, CouchDB, MarkLogic Server and MongoDB.
- Wide-column stores store data in tables that contain large numbers of columns. Examples include Accumulo, Cassandra, Google Cloud Bigtable, HBase and ScyllaDB.
- Graph databases store data in graph form to highlight the connections between different data elements. Examples include AllegroGraph, Amazon Neptune, ArangoDB, Neo4j and TigerGraph.
Each type of NoSQL DBMS is best suited to particular use cases and has individual pluses and minuses to consider. A thorough technology evaluation requires in-depth knowledge of the different NoSQL product categories, along with a solid understanding of the data and application needs that the chosen DBMS will have to support.
3. In-memory DBMS
DBMS technologies also include the in-memory DBMS (IMDBMS), sometimes referred to as a main memory DBMS. An IMDBMS relies mostly on memory to store data, as opposed to using disk-based storage. That makes the data in a database more immediately accessible to end users.
The primary use case for in-memory databases is to improve performance in applications that require fast data throughput. Because data is maintained in memory, I/O latency is greatly reduced by eliminating mechanical disk movement, seek time and the transfer of data to a buffer. IMDBMS products can also reduce processing overhead because the internal algorithms they run usually are simpler to execute, with fewer CPU instructions than the ones in disk-based systems.
In-memory databases aren't a wholly distinct product category, though. SAP HANA is an in-memory RDBMS, as are technologies like Oracle TimesTen In-Memory Database, Volt Active Data and SingleStore, while Aerospike and Redis are examples of in-memory NoSQL DBMSes. In addition, Oracle, Microsoft and IBM have all added in-memory processing capabilities to their flagship RDBMS platforms.
4. Multimodel DBMS
A growing product category is the multimodel DBMS, which supports more than one type of data model. Many NoSQL offerings do so -- for example, combining document and key-value stores. Likewise, some RDBMS platforms have also evolved to support NoSQL data models, such as adding document and graph stores to their core relational engine.
5. NewSQL DBMS
NewSQL databases are a somewhat informal DBMS category. They're RDBMS platforms developed to bridge the gap with NoSQL systems by supporting distributed databases with ACID-compliant capabilities for horizontal scaling, real-time processing and other high-volume data needs. Examples include ClustrixDB, CockroachDB, Google Cloud Spanner, NuoDB, Volt Active Data and YugabyteDB. Some vendors of such technologies now eschew NewSQL as a label and describe their products as distributed SQL DBMSes.
Other DBMS categories that aren't as prevalent as the ones above include the following:
- Time series databases collect data generated on ongoing basis and store it in successive order to enable analysis of how the data changes over time.
- Search databases are specialized data stores that are designed to support enterprise search applications, as well as application monitoring and other uses.
- XML DBMSes are architected to support XML data. However, most RDBMS platforms now provide XML support, as do NoSQL document databases.
- A columnar DBMS is a SQL database system tailored to BI and data warehousing because it's optimized for reading a few columns that contain many rows at once in order to speed up queries. While traditional columnar technologies weren't as widely used as relational databases in on-premises data warehouses, the use of columnar storage in the popular cloud data warehouse platforms mentioned above has expanded its adoption.
- Popular in the 1990s, object-oriented DBMSes represent data as objects and were designed to work with object-oriented programming languages. Hybrid object-relational databases that blended those two approaches were also developed.
- Pre-relational DBMS products include hierarchical and network database systems developed to run on mainframes. Two such technologies are IBM Information Management System -- better known as IMS -- and IDMS, which are both still available.
Additional considerations for choosing a DBMS
As you examine the different types of database technologies and then specific DBMS products for a planned purchase, one issue that should be at the top of the list to consider is hardware platform support. The predominant computing environments today are Linux, Windows, Unix and the mainframe. Not every DBMS is supported on each of those platforms.
Another consideration is technical support from DBMS vendors. Software maintenance and support is a critical capability -- and it can be a significant ongoing expense. Many DBMS technologies are open source, particularly NoSQL ones. The open source approach increases flexibility and reduces the initial cost. However, open source software lacks support unless you buy a commercial subscription. Total cost of ownership can end up being higher when you factor in the related administration and support costs.
You might also choose to reduce implementation and administration pain by deploying a DBMS in the cloud. You can either deploy a self-managed system in a private or public cloud or use a managed cloud database service offered by a cloud provider or other database vendor. Also known as database as a service (DBaaS), the latter approach frees users from having to install, configure and administer DBMS products themselves. Instead, the vendor handles those tasks as part of the DBaaS environment. Overall, database deployments are clearly shifting toward the cloud: In an April 2022 blog post, Gartner analyst Merv Adrian said cloud databases accounted for 49% of worldwide DBMS revenues in 2021 and wrote that their growth "has been stunning."
If your organization is considering a DBMS purchase, it's important to document your specific needs, determine which DBMS category is the right fit and then examine the leading products in that category. Doing so will require additional details on the different types of DBMS software and the use cases for which each technology is optimized. Indeed, there are many variables that need to be evaluated to ensure you make a wise decision when procuring a database management system.