What is an analytic database?
An analytic database, also called an analytical database, is a read-only system that stores historical data on business metrics such as sales performance and inventory levels. Business analysts, corporate executives and other workers run queries and reports against an analytic database. The information is regularly updated to include recent transaction data from an organization's operational systems.
An analytic database is specifically designed to support business intelligence (BI) and analytic applications, typically as part of a data warehouse or data mart. This differentiates it from an operational, transactional or online transaction processing database, which is used for transaction processing, such as order entry and other business applications.
While databases that do transaction processing can also support data warehouses and BI applications, analytic database vendors claim their products offer performance and scalability advantages over conventional relational database software.
Components and types of analytic databases
Analytical databases store massive amounts of data that organizations use to gain insight into their business, customers and more. The data stored in analytical databases comes from sources such as enterprise resource planning (ERP), customer relationship management (CRM) and other business applications or proprietary data. Some analytical databases include a data warehouse, which is designed to be flexible for users who need to create specific reports and queries.
Two major components of analytical databases are the data model and the query language. The data model is the structure through which data is stored. Data models can be relational -- where data is stored in separate tables and joined together when necessary -- or object-oriented -- where data is stored in fields of objects and represented as objects in software.
A query language is a standardized and interpreted programming language for the retrieval of information from a database management system (DBMS) or database. The information is usually, but not always, in the form of a table or a set of tables. A query language enables users to retrieve information from a database without knowing the internal structure of the database. The most well-known query language is Structured Query Language (SQL). It is used to access and manipulate data in relational databases. SQL is a declarative programming language -- as opposed to a procedural one -- meaning that its syntax is defined by the data it operates on rather than the steps to manipulate the data.
There are five main types of analytic databases:
- Columnar databases. A columnar database stores data in large contiguous blocks of memory called data columns. This differs from a row-oriented database, where data is stored in tables split across columns and rows. It is the use of columns that makes columnar databases well suited to analytical processing, and able to meet the requirements of data warehouses. For example, columnar databases provide a solution to the problem of data sprawl. As data volumes grow, the value of high-speed columnar database technology scales. Because columnar databases are designed to store, analyze and retrieve data by column, they can accelerate data loading and improve data analytics performance. The columnar architecture of this database also allows for efficient data compression and fast aggregate queries. Examples of columnar databases include Amazon Redshift, MariaDB and Apache Kudu.
- Massively parallel processing (MPP) databases. In an MPP database, data is stored on multiple servers rather than on a single server. Each server has its own set of data and its own computing power. MPP databases typically provide high availability and redundancy by using clusters of computers to serve multiple users. They also efficiently handle large amounts of data while providing fast access to that data. Examples of MPP databases include Vertica and Teradata.
- In-memory databases. In-memory databases are optimized to store data in random-access memory. This contrasts with traditional databases that rely on hard drives to store both data and program instructions. Because data is stored in memory, in-memory databases provide a near-real-time response. Storing data in memory offers several potential benefits over traditional databases, including faster query processing, less I/O overhead and the ability to handle larger databases. Examples of in-memory databases include SAP HANA, Oracle TimesTen and Aerospike.
- Online analytical processing (OLAP) databases. An OLAP database is an online database that is used to create reports and analyses that are faster and more efficient than traditional databases. An OLAP database stores multidimensional cubes of aggregated data for analyzing information based on multiple data attributes. The data is stored in a structure designed to facilitate analysis of the data. Because OLAP databases are stored in a multidimensional format, they can handle a high volume of data better than a traditional database. Examples of database structures include star schemas and snowflake schemas. OLAP databases are used in several industries, including financial services and retail. Examples of OLAP databases include Snowflake and IBM Cognos.
- Data warehouse appliances. Data warehouse appliances are integrated hardware and software tools designed to optimize the performance of data warehouses. A data warehouse is an information repository large organizations use to centralize, store and index vast amounts of data from disparate sources. Data warehouse appliances simplify and expedite data warehouse implementation and management. They are commonly used in BI to track and analyze data from many sources, including ERP systems, CRM software and other databases. Numerous vendors offer data warehouse appliances, including IBM, Microsoft, Oracle and Teradata.
Thinking of migrating to an open source database? Learn how in this article which details the steps and considerations for making the move.