Data analytics and AI
This glossary contains definitions related to customer data analytics, predictive analytics, data visualization and operational business intelligence. Some definitions explain terms used with Hadoop and other software tools for big data analytics. Other definitions relate to the strategies that business intelligence professionals, data scientists, statisticians and data analysts use to make data-driven decisions.

Algorithms
Terms related to procedures or formulas for solving a problem by conducting a sequence of specified actions. In computing, algorithms in the form of mathematical instructions play an important part in search, artificial intelligence (AI) and machine learning.
-
What is prediction error?
A prediction error is the failure of a model of a system to accurately forecast outcomes.
-
What is HMAC (Hash-Based Message Authentication Code)?
Hash-based message authentication code (HMAC) is a message authentication technique that uses a cryptographic key with a hash function to verify the integrity and authenticity of a message.
-
What is binary and how is it used in computing?
Binary describes a numbering scheme in which there are only two possible values for each digit -- 0 or 1 -- and is the basis for all binary code used in computing systems.
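
To make the binary definition concrete, the sketch below converts a non-negative integer to its base-2 digits and back. The function names `to_binary` and `from_binary` are illustrative only and are not taken from this glossary; Python's built-in `bin()` and `int(text, 2)` do the same job.

```python
def to_binary(n: int) -> str:
    """Return the base-2 digits of a non-negative integer."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        digits.append(str(n % 2))  # remainder is the next lowest-order bit
        n //= 2                    # drop that bit and continue
    return "".join(reversed(digits))


def from_binary(bits: str) -> int:
    """Return the integer value of a string of 0s and 1s."""
    if not bits:
        raise ValueError("bits must not be empty")
    value = 0
    for bit in bits:
        if bit not in ("0", "1"):
            raise ValueError(f"not a binary digit: {bit!r}")
        value = value * 2 + int(bit)  # shift left one place, then add the bit
    return value


if __name__ == "__main__":
    print(to_binary(10))        # 1010
    print(from_binary("1101"))  # 13
```

Each digit position represents a power of two, so `1010` is 8 + 0 + 2 + 0 = 10.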
Artificial intelligence
Terms related to artificial intelligence (AI), including definitions about machine learning and words and phrases about training data, algorithms, natural language processing, neural networks and automation.
-
What are oversampling and undersampling?
Oversampling and undersampling are techniques used in data analytics and statistics to modify unequal data classes to create balanced data sets.
-
What is pattern recognition?
Pattern recognition is the ability to detect arrangements of characteristics in data that yield information about a given system or data set.
-
What is prediction error?
A prediction error is the failure of a model of a system to accurately forecast outcomes.
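
As a rough illustration of how prediction error might be quantified, the sketch below compares forecasts with observed values. The definition above does not name a metric; mean absolute error and root mean squared error are assumed here as two common summaries, and the function names and sample numbers are made up for the example.

```python
import math


def prediction_errors(actual, predicted):
    """Return the per-observation errors (actual minus predicted)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    return [a - p for a, p in zip(actual, predicted)]


def mean_absolute_error(actual, predicted):
    """Average size of the errors, ignoring their sign."""
    errors = prediction_errors(actual, predicted)
    return sum(abs(e) for e in errors) / len(errors)


def root_mean_squared_error(actual, predicted):
    """Square root of the average squared error; penalizes large misses more."""
    errors = prediction_errors(actual, predicted)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


if __name__ == "__main__":
    # Hypothetical observed values and model forecasts.
    actual = [10, 12, 15, 20]
    predicted = [12, 12, 14, 16]
    print(mean_absolute_error(actual, predicted))      # 1.75
    print(root_mean_squared_error(actual, predicted))  # about 2.29
```

A value of zero for either metric would mean the forecasts matched the observed outcomes exactly.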
Data and data management
Terms related to data, including definitions about data warehousing and words and phrases about data management.
-
What is health informatics?
Health informatics is the practice of applying insight gained from acquiring and analyzing health and biomedical data to help clinicians make better healthcare-related decisions and improve patient care.
-
What is a distributed database?
A distributed database is a database that consists of two or more files located in different sites on the same or different networks.
-
What is YAML (YAML Ain't Markup Language)?
YAML (YAML Ain't Markup Language) is a data serialization language used as the input format for diverse software applications.
Database management
Terms related to databases, including definitions about relational databases and words and phrases about database management.
-
What is a distributed database?
A distributed database is a database that consists of two or more files located in different sites on the same or different networks.
-
What is MySQL?
MySQL is a popular, scalable, user-friendly, open source and free relational database management system (RDBMS) that uses Structured Query Language (SQL) to store, manage, and manipulate data.
-
What is database as a service (DBaaS)?
Database as a service (DBaaS) is a cloud computing managed service offering that provides access to a database without requiring the setup of physical hardware, the installation of software or the need to configure the database.