What is a cloud database? An in-depth cloud DBMS guide
A cloud database is an organized and managed collection of data in an IT system that resides on a public, private or hybrid cloud computing platform. From an overall design and functionality perspective, a cloud database is no different than an on-premises one that runs on an organization's own data center systems. The biggest difference between them lies in how the database is deployed and managed.
For example, the same database appears identical to end users and applications, whether it's on premises or in the cloud. Depending on the particular database software that's used, cloud databases can store structured, unstructured or semistructured data, just as their on-premises counterparts do.
But using a cloud database changes the responsibilities of IT and data management teams. Cloud vendors install and manage the underlying system infrastructure and, in managed services environments, the database platform. That reduces the routine management work traditionally done by IT operations workers and database administrators (DBAs). A DBA can then take on other tasks, such as optimizing databases for applications and tracking the usage and cost of cloud database systems.
Like other IT systems, database deployments are clearly shifting toward the cloud. In a report on cloud databases published in December 2021, Gartner forecasted that they would account for 50% of total database management system (DBMS) revenues worldwide in 2022, a year earlier than it previously predicted. Also, in a survey of 753 cloud users conducted in late 2021 by IT management tools vendor Flexera, 55% said their organizations were using data warehouses in the cloud, while 49% had adopted cloud-based relational database services and 38% were using NoSQL database services.
This comprehensive guide to cloud databases further explains what they are, how they work and their potential IT and business benefits for organizations, as compared with on-premises databases. You'll also find information on cloud database technologies, vendors and security issues, plus more details on database administration responsibilities in the cloud. Throughout the guide, hyperlinks point to related articles that cover those topics and others in more depth.
How cloud databases work
In businesses, databases are used to collect, organize and deliver data to executives and workers for operational and analytics applications. In general, cloud databases provide the same data processing, management and access capabilities as on-premises ones. Existing on-premises databases usually can be migrated to the cloud, along with the applications they support.
Instead of traditional software licenses, pricing is based on the use of system resources, which can be provisioned on demand as needed to meet processing workloads. Alternatively, users can reserve database instances -- typically for at least a year -- to get discounted pricing on regular workloads with consistent capacity requirements.
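To make the tradeoff between on-demand and reserved pricing concrete, here is a minimal sketch of the arithmetic. The hourly rate and the discount are made-up figures for illustration, not any vendor's actual prices, and the model assumes a reservation is billed for every hour of the year whether or not the instance is busy.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def on_demand_cost(hourly_rate: float, hours_used: float) -> float:
    """Pay-as-you-go: billed only for the hours the instance runs."""
    return hourly_rate * hours_used


def reserved_cost(hourly_rate: float, discount: float) -> float:
    """One-year reservation: billed for every hour, at a discounted rate."""
    return hourly_rate * (1 - discount) * HOURS_PER_YEAR


def break_even_utilization(discount: float) -> float:
    """Fraction of the year an instance must run before reserving is cheaper."""
    return 1 - discount


# Hypothetical figures, not real prices.
rate = 0.50      # dollars per instance-hour on demand
discount = 0.40  # 40% off for a one-year commitment

steady = on_demand_cost(rate, HOURS_PER_YEAR)          # runs all year
bursty = on_demand_cost(rate, 0.25 * HOURS_PER_YEAR)   # runs a quarter of the year
reserved = reserved_cost(rate, discount)

print(f"on demand, always on:        ${steady:,.2f}")
print(f"on demand, 25% of the year:  ${bursty:,.2f}")
print(f"reserved for one year:       ${reserved:,.2f}")
print(f"break-even utilization:      {break_even_utilization(discount):.0%}")
```

Under these assumptions, a reservation pays off only for a workload that runs more than 60% of the year, which is why reserved instances suit regular workloads with consistent capacity requirements.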
Organizations that are implementing databases in the public cloud choose between the following two deployment models:
- Self-managed database. This is an infrastructure as a service (IaaS) environment, in which the database runs in a virtual machine on a system operated by a cloud provider. The provider manages and supports the cloud infrastructure, including servers, operating systems and storage devices. But the user organization is responsible for database deployment, administration and maintenance. As a result, it's akin to an on-premises deployment for the DBA, who retains full management control of the database.
- Managed database service. Database as a service (DBaaS) environments are fully managed by the vendor, which could be a cloud platform provider or another database vendor that runs its cloud DBMS on a platform provider's infrastructure. Under the DBaaS model, both the system infrastructure and the database platform are managed for the customer. The DBaaS vendor handles provisioning, backups, scaling, patching, upgrades and other basic database administration functions, while the DBA monitors the database and coordinates with the vendor on some administrative tasks. Similar data warehouse as a service (DWaaS) offerings are also available for deployments of cloud data warehouses.
In addition, some cloud providers -- Amazon Web Services (AWS) and Oracle, for example -- offer versions of their DBaaS technologies for installation in on-premises data centers as part of a private cloud or a hybrid cloud infrastructure that combines public and private clouds. As in a regular DBaaS environment, the provider deploys the databases on its own systems and manages them for customers. The difference is that it delivers those systems to the customer's data center to run there and manages the databases remotely.
Types of cloud databases
A wide variety of cloud databases are available, matching the different types of database technologies that can be deployed on premises. At this point, every notable database vendor offers its software in the cloud. That includes cloud-native databases developed specifically for use in cloud environments and existing on-premises databases that now support the cloud.
The following are the key types of databases that cloud users can take advantage of:
- Relational databases. SQL-based relational software has dominated the database market since the 1990s and remains the most widely used technology, particularly well suited for transaction processing and other applications involving structured data.
- NoSQL databases. NoSQL systems forgo the rigid schemas of relational databases, making them a better option for unstructured data. There are four major NoSQL product categories: document databases, graph databases, wide-column stores and key-value databases.
- Multimodel databases. They support more than one data model, enabling them to run a wider set of applications. Many relational and NoSQL databases now qualify as multimodel through add-ons -- for example, the addition of a graph module to a relational DBMS.
- Distributed SQL databases. Initially labeled as NewSQL, these technologies distribute relational databases across multiple computing nodes to create transactional systems that can provide NoSQL-like levels of scalability.
- Cloud data warehouses. Initially developed to provide data warehousing capabilities for business intelligence and reporting applications, they typically now also support data lake development, machine learning and other advanced analytics functions.
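The difference between the relational and document models in the list above can be sketched with the same customer data in both shapes. This is only an illustration: it uses SQLite from the Python standard library as a stand-in for a relational DBMS and a plain dictionary as a stand-in for a document database record, and the table, column and field names are invented.

```python
import json
import sqlite3

# Relational model: a fixed schema, with related data split across tables
# and recombined at query time with a join.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Acme Corp');
    INSERT INTO orders VALUES (10, 1, 250.0), (11, 1, 99.5);
""")
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers c JOIN orders o ON o.customer_id = c.id
    GROUP BY c.id
""").fetchall()
print(rows)

# Document model: one self-describing record that nests the related data.
# Other records in the same collection need not have the same fields.
customer_doc = {
    "_id": 1,
    "name": "Acme Corp",
    "orders": [{"id": 10, "total": 250.0}, {"id": 11, "total": 99.5}],
    "notes": "prefers email",  # an ad hoc field, no schema change required
}
order_total = sum(order["total"] for order in customer_doc["orders"])
print(json.dumps(customer_doc, indent=2))
print(order_total)
```

The relational version enforces structure up front and answers the question with SQL. The document version trades that enforcement for flexibility, which is the schema tradeoff described for NoSQL systems.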
Key cloud database management system components
Like other types of DBMS technologies, cloud database platforms include a set of components that work together to process and manage data. The list of key components includes the following items:
- a storage engine that manages data storage;
- a metadata catalog that contains data about database objects;
- a database access language, such as SQL, for querying and modifying data;
- a query optimization engine and a separate query processor;
- a lock manager to control concurrent access to data;
- a log manager to record changes made to the data; and
- a set of database management utilities.
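Several of these components can be observed from the outside in almost any SQL database. The sketch below uses SQLite from the Python standard library purely because it runs without setup. It is an embedded engine rather than a cloud DBMS, and the table and index names are invented, but the same ideas carry over: a catalog that describes objects, a query planner that chooses an access path, and change logging that makes rollback possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Database access language: SQL statements define and modify data.
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)"
)
conn.execute("CREATE INDEX idx_owner ON accounts(owner)")
conn.executemany(
    "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
    [("alice", 100.0), ("bob", 50.0)],
)
conn.commit()

# Metadata catalog: SQLite exposes its catalog as the sqlite_master table.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()
print(catalog)

# Query optimizer: EXPLAIN QUERY PLAN reports the access path it chose,
# such as whether an index is used for the lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT balance FROM accounts WHERE owner = ?",
    ("alice",),
).fetchall()
for step in plan:
    print(step)

# Change logging: a failed transaction is undone from the engine's record
# of changes (SQLite calls it a journal), leaving the data as it was.
try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE accounts SET balance = balance - 500 WHERE owner = 'alice'"
        )
        raise ValueError("insufficient funds")
except ValueError:
    pass

balance = conn.execute(
    "SELECT balance FROM accounts WHERE owner = 'alice'"
).fetchone()[0]
print(balance)
```

In a managed cloud service, the storage engine, lock manager and log manager are largely hidden behind the service, but the catalog and query plans usually remain visible to DBAs through similar commands.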
Cloud database benefits
Compared with running databases on premises, cloud databases offer the following potential IT and business advantages to an organization:
- Increased scalability and flexibility. Cloud database systems can be easily scaled up by adding more processing and storage capacity when workloads increase. Some vendors offer autoscaling features that do so dynamically, without users even needing to submit a request. In addition, an organization can quickly deploy new databases and shut down ones that it no longer needs, matching its database strategy to the speed of business.
- Elimination of IT infrastructure. Because the cloud provider is responsible for the system infrastructure in a cloud database environment, an organization may be able to reduce its own IT footprint by decommissioning systems, especially if it's moving on-premises databases to the cloud. At the very least, it can avoid the need to add more systems when it deploys new databases.
- Faster access to new features. With on-premises databases, users typically need to wait for and then install a software upgrade to get new features and functionality. DBaaS vendors can update their cloud databases on an ongoing basis, enabling organizations to take advantage of new features as soon as they're available.
- More reliable systems with guaranteed uptime. Cloud vendors provide high availability, automated backup and disaster recovery capabilities that may be more advanced than what an organization implements itself. They also guarantee uptime percentages as part of their cloud service-level agreement (cloud SLA), which gives the vendors an incentive to keep cloud database platforms running smoothly.
- Cost savings. Reduced capital expenditures, data center operating costs and space needs in IT facilities, as well as possible IT staff cuts, can result in lower spending overall. But that isn't a sure thing: Pay-as-you-go cloud services can cost more than planned if resource utilization exceeds expectations or, conversely, if excess capacity goes unnoticed. A cloud database environment needs to be monitored closely to keep cloud costs under control.
On the other hand, on-premises databases may still be best for some organizations, particularly if they want or need to retain full control of the database environment. Get advice on how to decide between cloud and on-premises databases in an article by Chris Foot, a senior strategist and consultant at IT services provider RadixBay.
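When comparing the uptime guarantees mentioned in the benefits list, it helps to convert a percentage into the downtime it still permits. The following is a small sketch of that conversion. The percentages are common examples rather than any particular vendor's SLA terms, and a 30-day month is assumed.

```python
def allowed_downtime_minutes(uptime_percent: float, days: int = 30) -> float:
    """Minutes of downtime permitted in a period by an uptime guarantee."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    total_minutes = days * 24 * 60
    return total_minutes * (1 - uptime_percent / 100)


# Illustrative guarantee levels, not taken from a specific SLA.
for pct in (99.0, 99.9, 99.95, 99.99):
    minutes = allowed_downtime_minutes(pct)
    print(f"{pct}% uptime allows about {minutes:.1f} minutes down per 30 days")
```

By this arithmetic, a 99.9% guarantee still allows roughly 43 minutes of downtime in a 30-day month, while 99.99% allows a little over four minutes.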
Migrating databases to the cloud
As mentioned above, migrating on-premises databases to a cloud environment can enable an organization to retire in-house IT systems and gain the other benefits of using cloud databases. Relocating a database to the cloud can also be an effective way to boost processing efficiency and application performance as part of a broader cloud deployment.
But database migration can be a complex process. Before starting one, organizations need to consider various factors and plan a database migration strategy. For example, whether to migrate to a self-managed IaaS environment or a vendor-managed DBaaS one is a fundamental decision. Another is whether to migrate to the cloud version of the current DBMS or a different database technology. Changing databases may have financial or functional benefits, but it could also cause compatibility issues.
Even some related on-premises and cloud database technologies don't fully match up on features. For example, Microsoft's Azure SQL Database relational cloud service shares a common code base with its SQL Server on-premises database, but there are differences between the two products that could require some reengineering of SQL Server databases before they can be migrated to Azure SQL Database. Azure SQL Managed Instance, a version of the cloud software that Microsoft developed to make database migration easier, still isn't 100% compatible with SQL Server.
Leading cloud DBMS vendors
Not surprisingly, the top cloud platform providers -- AWS, Google Cloud, Microsoft and Oracle -- are also the leading database vendors in the cloud, according to Gartner. They all support both IaaS and DBaaS environments on their own platforms and offer different types of cloud databases, including relational, NoSQL, data warehouse and special-purpose ones. For example, AWS offers 16 separate database engines, while Google and Microsoft list 10 and nine, respectively. For more insight on those four vendors, read a comparison of their cloud database offerings.
The following are some other prominent cloud database vendors:
- IBM and SAP, two other major IT vendors that have transitioned from on-premises databases and now offer broad sets of cloud DBMS services;
- NoSQL database vendors Couchbase, DataStax, MongoDB, Neo4j and Redis, among others;
- cloud data warehouse vendors Snowflake and Yellowbrick Data;
- analytics database vendors Exasol and Teradata;
- open source relational database vendors EDB and MariaDB;
- multimodel database vendors InterSystems and MarkLogic; and
- distributed SQL database vendors Cockroach Labs and Yugabyte.
What to evaluate when choosing a cloud database
The database is one of the most important technologies in any IT environment. Here are some of the features and issues organizations should examine when they evaluate cloud databases for planned deployments:
- Performance. As with any type of IT system, this is probably the top factor to consider, especially if the database will be supporting high-performance workloads. Scalability is a critical part of that -- for example, to make sure that real-time processing jobs don't bog down. Performance monitoring and tuning capabilities are another key aspect to look at.
- Cost. The cloud providers offer free online cost calculators that can be used to check different scenarios on pricing models, service configurations, processing regions and other parameters to help balance expected resource needs and the available budget.
- Availability. High availability, disaster recovery and data backup and recovery capabilities should all be assessed, too, along with the cloud vendor's uptime SLA.
- Security. Securing a DBaaS environment isn't solely the vendor's responsibility, so it's crucial to know what the vendor will handle and what security tools and measures it will apply.
Cloud database architecture considerations
The most straightforward approach for deploying cloud databases is to use a single public cloud platform. That provides a consistent underlying cloud infrastructure and a single cloud provider to work with, even if multiple DBaaS vendors are involved. But it may not always be feasible or meet an organization's IT and business needs. As a result, other architectural strategies may need to be explored.
One option is deploying databases across a hybrid cloud, putting some of them in a public cloud and others in a private cloud set up on premises.
In an article by technology writer George Lawton on creating a hybrid cloud database environment, Alexander Wurm, a research analyst at advisory services firm Nucleus Research, said using one enables organizations to "reap the benefits of the modern cloud, such as regular updates and elastic scalability, without interfering with the security and reliability of existing on-premises infrastructure in support of mission-critical workloads."
Lawton listed eight items to consider when planning a hybrid cloud database strategy, including data security, data egress charges, potential data latency issues and how to group applications and databases together into logical units to make the deployment process more manageable.
A multi-cloud database architecture is an even more expansive approach that involves the use of multiple public cloud platforms. It can avoid cloud provider lock-in and enable organizations to deploy databases and applications in the cloud platform that's best suited to them. In another article, IT professional Jeff McCormick detailed a set of 10 multi-cloud database management best practices, including the following steps:
- Start with a comprehensive plan and a governance framework.
- Run the right database in the right cloud.
- Use data services that support multi-cloud environments.
- Exploit managed database services, or DBaaS.
- Consider database portability across multiple clouds.
- Reduce the number of different databases.
- Reduce the number of instances of the same database.
- Optimize data access for applications and end users.
- Keep data local in one cloud platform when possible.
- Connect cloud networks to reduce data latency.
Cloud database security
As mentioned above, cloud database security isn't all on the vendor. What it handles can vary from vendor to vendor. But under the shared responsibility model for cloud security, users need to fully manage database security in IaaS environments, which makes sense, since they deploy and manage the DBMS themselves. DBaaS vendors take on more responsibility for securing the database platform, but DBAs or security teams in organizations are usually still on the hook for things like identity and access management, endpoint security, application security and some aspects of data security.
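One way to keep the shared responsibility split straight is to write it down as a lookup table. The sketch below encodes the division described above. The area names and the assignments are a simplification for illustration, and because the split varies from vendor to vendor, a real version would need to be checked against each provider's own documentation.

```python
from enum import Enum


class Owner(Enum):
    VENDOR = "vendor"
    CUSTOMER = "customer"
    SHARED = "shared"


# Simplified, assumed split; actual terms differ by vendor and service.
RESPONSIBILITIES = {
    "infrastructure security": {"iaas": Owner.VENDOR, "dbaas": Owner.VENDOR},
    "database platform security": {"iaas": Owner.CUSTOMER, "dbaas": Owner.VENDOR},
    "identity and access management": {"iaas": Owner.CUSTOMER, "dbaas": Owner.CUSTOMER},
    "endpoint security": {"iaas": Owner.CUSTOMER, "dbaas": Owner.CUSTOMER},
    "application security": {"iaas": Owner.CUSTOMER, "dbaas": Owner.CUSTOMER},
    "data security": {"iaas": Owner.CUSTOMER, "dbaas": Owner.SHARED},
}


def customer_tasks(model: str) -> list[str]:
    """Areas the user organization owns, fully or partly, under a model."""
    return sorted(
        area
        for area, owners in RESPONSIBILITIES.items()
        if owners[model] is not Owner.VENDOR
    )


print("IaaS:", customer_tasks("iaas"))
print("DBaaS:", customer_tasks("dbaas"))
```

Even under DBaaS, the customer's list stays long: only the database platform itself moves to the vendor's side in this simplified view.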
Read more about the security features in cloud database services from AWS, Microsoft and Google, as detailed by Dave Shackleford, principal consultant at Voodoo Security, who also outlined some database security best practices for user organizations.
Cloud database management roles and responsibilities
Even in a DBaaS environment, DBAs play the lead role in managing an organization's cloud databases. The difference is that the cloud vendor takes over most of the regular, ongoing administration of a database platform. Instead of handling those tasks directly, the DBA can step in when necessary -- for example, to adjust data backup or system maintenance schedules because of application needs, according to RadixBay's Chris Foot.
In an article on how cloud databases change the DBA's role, Foot also cited some new responsibilities. In particular, he wrote that monitoring the usage and cost of cloud database systems is a critical task for a DBA, to help avoid budget overruns and identify required changes in configurations or selected performance levels.
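A simple form of that cost monitoring is to project month-end spending from the bill so far and compare it with the budget. The sketch below assumes spending accrues at a constant daily rate and uses invented figures and an arbitrary 50% threshold for flagging unused capacity. Real cloud billing is rarely this linear, so treat it as a starting point rather than a forecasting method.

```python
def projected_month_end_spend(
    spend_to_date: float, day_of_month: int, days_in_month: int = 30
) -> float:
    """Linear projection: assumes the daily spend so far continues unchanged."""
    if not 1 <= day_of_month <= days_in_month:
        raise ValueError("day_of_month must fall within the month")
    return spend_to_date / day_of_month * days_in_month


def budget_status(
    spend_to_date: float,
    day_of_month: int,
    monthly_budget: float,
    days_in_month: int = 30,
) -> str:
    """Classify the projection against the budget."""
    projected = projected_month_end_spend(spend_to_date, day_of_month, days_in_month)
    if projected > monthly_budget:
        return "over budget"
    if projected < 0.5 * monthly_budget:  # arbitrary threshold for this sketch
        return "possible excess capacity"
    return "on track"


# Hypothetical figures: $1,200 spent by day 10 against a $3,000 budget.
print(projected_month_end_spend(1200.0, 10))
print(budget_status(1200.0, 10, 3000.0))
```

A projection that runs well over the budget suggests a closer look at configurations or performance levels, while one that runs well under it may point to capacity that is being paid for but not used.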
Cloud database market growth and trends
In an April 2022 blog post, Gartner analyst Merv Adrian said 49% of total DBMS revenues in 2021 came from cloud databases, putting them on the doorstep of hitting the 50% level that Gartner forecasted for 2022. "The biggest DBMS market story continues to be the enormous impact of revenue shifting to the cloud," Adrian wrote. He added that the growth of cloud DBMS revenues over the past five years "has been stunning."
The initial push by database vendors to move to the cloud is nearly over, Adrian and four other Gartner analysts wrote in the consulting firm's 2021 Magic Quadrant report on cloud database management systems. Now, technology developments are "more about exploiting the cloud," they said. For example, the analysts predicted that the ability to use metadata to help database users discover and understand data is an area that cloud DBMS vendors will increasingly address.
Freelance technology writer Robert Sheldon and former TechTarget news writer Joel Shore contributed to this article.