Data and data management
Terms related to data, including definitions about data warehousing and words and phrases about data management.
- data - In computing, data is information that has been translated into a form that is efficient for movement or processing.
- data abstraction - Data abstraction is the reduction of a particular body of data to a simplified representation of the whole.
- data activation - Data activation is a marketing approach that uses consumer information and data analytics to help companies gain real-time insight into target audience behavior and plan for future marketing initiatives.
- data aggregation - Data aggregation is any process whereby data is gathered and expressed in a summary form.
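As a minimal sketch (the records and field names are hypothetical), aggregation can be as simple as grouping rows and expressing each group as a summed total:

```python
from collections import defaultdict

# Hypothetical raw records: (region, amount) pairs.
sales = [("east", 100), ("west", 250), ("east", 50), ("west", 25)]

def aggregate_by_region(rows):
    """Group records by region and express each group in summary form."""
    totals = defaultdict(int)
    for region, amount in rows:
        totals[region] += amount
    return dict(totals)

print(aggregate_by_region(sales))  # {'east': 150, 'west': 275}
```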
- data analytics (DA) - Data analytics (DA) is the process of examining data sets to find trends and draw conclusions about the information they contain.
- data architect - A data architect is an IT professional responsible for defining the policies, procedures, models and technologies to be used in collecting, organizing, storing and accessing company information.
- Data as a Service (DaaS) - Data as a Service (DaaS) is an information provision and distribution model in which data files (including text, images, sounds, and videos) are made available to customers over a network, typically the Internet.
- data availability - Data availability is a term used by computer storage manufacturers and storage service providers to describe how data should be available at a required level of performance in situations ranging from normal through disastrous.
- data breach - A data breach is a cyber attack in which sensitive, confidential or otherwise protected data has been accessed or disclosed in an unauthorized fashion.
- data catalog - A data catalog is a software application that creates an inventory of an organization's data assets to help data professionals and business users find relevant data for analytics uses.
- data center chiller - A data center chiller is a cooling system used in a data center to remove heat from one element and deposit it into another element.
- data center services - Data center services is a collective term for all the supporting components necessary to the proper operation of a data center.
- data citizen - A data citizen is an employee who relies on data to make decisions and perform job responsibilities.
- data classification - Data classification is the process of organizing data into categories that make it easy to retrieve, sort and store for future use.
- data clean room - A data clean room is a technology service that helps content platforms keep first-party user data private when interacting with advertising providers.
- data cleansing (data cleaning, data scrubbing) - Data cleansing, also referred to as data cleaning or data scrubbing, is the process of fixing incorrect, incomplete, duplicate or otherwise erroneous data in a data set.
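A minimal sketch of cleansing in practice, using hypothetical records and field names: normalize inconsistent formatting, drop incomplete rows and remove duplicates.

```python
# Hypothetical messy records: duplicates, inconsistent case, missing values.
raw = [
    {"name": "Ada Lovelace", "email": "ADA@EXAMPLE.COM"},
    {"name": "Ada Lovelace", "email": "ada@example.com"},  # duplicate
    {"name": "Alan Turing", "email": None},                # incomplete
]

def cleanse(records):
    """Fix formatting, drop incomplete rows and remove duplicate records."""
    seen, cleaned = set(), []
    for rec in records:
        if not rec.get("email"):                  # drop rows missing a required field
            continue
        email = rec["email"].strip().lower()      # fix inconsistent formatting
        if email in seen:                         # drop duplicates
            continue
        seen.add(email)
        cleaned.append({"name": rec["name"].strip(), "email": email})
    return cleaned

print(cleanse(raw))  # a single clean record survives
```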
- data collection - Data collection is the process of gathering data for use in business decision-making, strategic planning, research and other purposes.
- data context - Data context is the network of connections among data points.
- data curation - Data curation is the process of creating, organizing and maintaining data sets so they can be accessed and used by people looking for information.
- data democratization - Data democratization is the ability for information in a digital format to be accessible to the average end user.
- data destruction - Data destruction is the process of destroying data stored on tapes, hard disks and other forms of electronic media so that it is completely unreadable and cannot be accessed or used for unauthorized purposes.
- data dignity - Data dignity, also known as data as labor, is a theory positing that people should be compensated for the data they have created.
- data dredging (data fishing) - Data dredging -- sometimes referred to as data fishing -- is a data mining practice in which large data volumes are analyzed to find any possible relationships between them.
- data engineer - A data engineer is an IT worker whose primary job is to prepare data for analytical or operational uses.
- data exhaust - Data exhaust is a byproduct of user actions online and consists of the various files generated by web browsers and their plug-ins, such as cookies, log files and temporary internet files.
- data exploration - Data exploration is the first step in data analysis involving the use of data visualization tools and statistical techniques to uncover data set characteristics and initial patterns.
- data feed - A data feed is an ongoing stream of structured data that provides users with updates of current information from one or more sources.
- data flow diagram (DFD) - A data flow diagram (DFD) is a graphical or visual representation using a standardized set of symbols and notations to describe a business's operations through data movement.
- data governance policy - A data governance policy is a documented set of guidelines for ensuring that an organization's data and information assets are managed consistently and used properly.
- data gravity - Data gravity is an attribute of data that is manifest in the way software and services are drawn to it relative to its mass (the amount of data).
- data historian - A data historian is a software program that records the data created by processes running in a computer system.
- data in motion - Data in motion, also referred to as data in transit or data in flight, is digital information that is being transported between locations, either within or between computer systems.
- data in use - Data in use is data that is currently being updated, processed, accessed and read by a system.
- data ingestion - Data ingestion is the process of obtaining and importing data for immediate use or storage in a database.
- data integration - Data integration is the process of combining data from multiple source systems to create unified sets of information for both operational and analytical uses.
- data integrity - Data integrity is the assurance that digital information is uncorrupted and can only be accessed or modified by those authorized to do so.
- data labeling - Data labeling is the process of identifying and tagging data samples commonly used in the context of training machine learning (ML) models.
- data lake - A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed for analytics applications.
- data lakehouse - A data lakehouse is a data management architecture that combines the key features and the benefits of a data lake and a data warehouse.
- data lifecycle - A data lifecycle is the sequence of stages that a particular unit of data goes through from its initial generation or capture to its eventual archival and/or deletion at the end of its useful life.
- data lifecycle management (DLM) - Data lifecycle management (DLM) is a policy-based approach to managing the flow of an information system's data throughout its lifecycle: from creation and initial storage to when it becomes obsolete and is deleted.
- data literacy - Data literacy is the ability to derive information from data, just as literacy in general is the ability to derive information from the written word.
- data loss - Data loss is the intentional or unintentional destruction of information, caused by people or processes from within or outside of an organization.
- data loss prevention (DLP) - Data loss prevention (DLP) -- sometimes referred to as data leak prevention, information loss prevention and extrusion prevention -- is a strategy to mitigate threats to critical data.
- data management as a service (DMaaS) - Data management as a service (DMaaS) is a type of cloud service that provides enterprises with centralized storage for disparate data sources.
- data management platform (DMP) - A data management platform (DMP), also referred to as a unified data management platform (UDMP), is a centralized system for collecting and analyzing large sets of data originating from disparate sources.
- data marketplace (data market) - A data marketplace, or data market, is an online store where data is bought and sold; marketplaces typically offer various types of data for different markets and from different sources.
- data mart (datamart) - A data mart is a repository of data that is designed to serve a particular community of knowledge workers.
- data masking - Data masking is a method of creating a structurally similar but inauthentic version of an organization's data that can be used for purposes such as software testing and user training.
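A minimal sketch of two common masking tactics, using hypothetical helper names: tokenizing an email's local part while keeping the domain, and redacting all but the last four digits of a card number, so the masked values remain structurally similar but inauthentic.

```python
import hashlib

def mask_email(email: str) -> str:
    """Replace the local part with a deterministic token; keep the domain
    so the masked value stays structurally similar to the original."""
    local, _, domain = email.partition("@")
    token = hashlib.sha256(local.encode()).hexdigest()[:8]
    return f"user_{token}@{domain}"

def mask_card(number: str) -> str:
    """Keep only the last four digits of a card number."""
    return "*" * (len(number) - 4) + number[-4:]

masked = mask_email("ada@example.com")
print(masked)                          # e.g. user_xxxxxxxx@example.com
print(mask_card("4111111111111111"))   # ************1111
```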
- data mesh - Data mesh is a decentralized data management architecture for analytics and data science.
- data migration - Data migration is the process of transferring data between data storage systems, data formats or computer systems.
- data mining - Data mining is the process of sorting through large data sets to identify patterns and relationships that can help solve business problems through data analysis.
- data modeling - Data modeling is the process of creating a simplified diagram of a software system and the data elements it contains, using text and symbols to represent the data and how it flows.
- data observability - Data observability is a process and set of practices that aim to help data teams understand the overall health of the data in their organization's IT systems.
- data pipeline - A data pipeline is a set of network connections and processing steps that moves data from a source system to a target location and transforms it for planned business uses.
- data portability - Data portability is the ability to move data among different application programs, computing environments or cloud services.
- data preprocessing - Data preprocessing, a component of data preparation, describes any type of processing performed on raw data to prepare it for another data processing procedure.
- data profiling - Data profiling refers to the process of examining, analyzing, reviewing and summarizing data sets to gain insight into the quality of data.
- data protection management (DPM) - Data protection management (DPM) comprises the administration, monitoring and management of backup processes to ensure backup tasks run on schedule and data is securely backed up and recoverable.
- data quality - Data quality is a measure of the condition of data based on factors such as accuracy, completeness, consistency, reliability and whether it's up to date.
- data residency - Data residency refers to the physical or geographic location of an organization's data or information.
- data retention policy - A data retention policy, or records retention policy, is an organization's established protocol for retaining information for operational or regulatory compliance needs.
- data science as a service (DSaaS) - Data science as a service (DSaaS) is a form of outsourcing that involves the delivery of information gleaned from advanced analytics applications run by data scientists at an outside company to corporate clients for their business use.
- data scientist - A data scientist is an analytics professional who is responsible for collecting, analyzing and interpreting data to help drive decision-making in an organization.
- data set - A data set is a collection of data that contains individual data units organized (formatted) in a specific way and accessed by one or more specific access methods based on the data set organization and data structure.
- data silo - A data silo exists when an organization's departments and systems cannot, or do not, communicate freely with one another and share business-relevant data.
- data source name (DSN) - A data source name (DSN) is a data structure containing information about a specific database to which an Open Database Connectivity (ODBC) driver needs to connect.
- data splitting - Data splitting is the practice of dividing a data set into two or more subsets, such as separate training and testing sets for a machine learning model.
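A minimal sketch of a train/test split (function and parameter names are illustrative): shuffle a copy of the data with a fixed seed for reproducibility, then cut it at the chosen fraction.

```python
import random

def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of the data and split it into train and test subsets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(list(range(100)))
print(len(train), len(test))  # 75 25
```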
- data stewardship - Data stewardship is the management and oversight of an organization's data assets to help provide business users with high-quality data that is easily accessible in a consistent manner.
- data storytelling - Data storytelling is the process of translating data analyses into understandable terms in order to influence a business decision or action.
- data streaming - Data streaming is the continuous transfer of data from one or more sources at a steady, high speed for processing into specific outputs.
- data structures - A data structure is a specialized format for organizing, processing, retrieving and storing data.
- Data Transfer Project (DTP) - Data Transfer Project (DTP) is an open source initiative to facilitate customer-controlled data transfers between two online services.
- data transformation - Data transformation is the process of converting data from one format, such as a database file, XML document or Excel spreadsheet, into another.
- data validation - Data validation is the practice of checking the integrity, accuracy and structure of data before it is used for a business operation.
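A minimal sketch of record-level validation, with a hypothetical schema: each check covers one of the structure, type, range and format concerns, and the function returns the list of problems found.

```python
def validate_record(rec):
    """Run structure, type, range and format checks before the record is used."""
    errors = []
    if set(rec) != {"id", "age", "email"}:        # structural check
        errors.append("unexpected or missing fields")
    if not isinstance(rec.get("id"), int):        # type check
        errors.append("id must be an integer")
    if not 0 <= rec.get("age", -1) <= 130:        # range (accuracy) check
        errors.append("age out of range")
    if "@" not in str(rec.get("email", "")):      # format check
        errors.append("email looks malformed")
    return errors

print(validate_record({"id": 1, "age": 37, "email": "a@b.com"}))  # []
print(validate_record({"id": "x", "age": 200, "email": "nope"}))  # three errors
```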
- data virtualization - Data virtualization is an umbrella term used to describe an approach to data management that allows an application to retrieve and manipulate data without requiring technical details about the data.
- data warehouse - A data warehouse is a repository of data from an organization's operational systems and other sources that supports analytics applications to help drive business decision-making.
- data warehouse appliance - A data warehouse appliance is an all-in-one "black box" solution optimized for data warehousing.
- data warehouse as a service (DWaaS) - Data warehouse as a service (DWaaS) is an outsourcing model in which a cloud service provider configures and manages the hardware and software resources a data warehouse requires, and the customer provides the data and pays for the managed service.
- data-driven decision management (DDDM) - Data-driven decision management (DDDM) is an approach to business governance that values actions that can be backed up with verifiable data.
- database (DB) - A database is a collection of information that is organized so that it can be easily accessed, managed and updated.
- database management system (DBMS) - A database management system (DBMS) is system software for creating and managing databases, allowing end users to create, protect, read, update and delete data in a database.
- database marketing - Database marketing is a systematic approach to the gathering, consolidation and processing of consumer data.
- database normalization - Database normalization is the process of organizing the tables and columns of a relational database to reduce redundant data and improve data integrity; it is intrinsic to most relational database schemes.
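A minimal sketch of the idea using Python's built-in sqlite3 module (table and column names are invented for illustration): instead of repeating customer details on every order row, a normalized schema stores each customer once and references it by key, with a JOIN reassembling the full picture.

```python
import sqlite3

# Normalized schema: customer details live in one place; orders reference them.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ada', 'London');
    INSERT INTO orders VALUES (10, 1, 9.99), (11, 1, 19.99);
""")
# A JOIN reconstructs the denormalized view without storing it redundantly.
rows = con.execute("""
    SELECT c.name, c.city, o.amount
    FROM orders o JOIN customers c ON o.customer_id = c.id
    ORDER BY o.id
""").fetchall()
print(rows)  # [('Ada', 'London', 9.99), ('Ada', 'London', 19.99)]
```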
- database replication - Database replication is the frequent electronic copying of data from a database in one computer or server to a database in another -- so that all users share the same level of information.
- DataOps - DataOps is an Agile approach to designing, implementing and maintaining a distributed data architecture that will support a wide range of open source tools and frameworks in production.
- Db2 - Db2 is a family of database management system (DBMS) products from IBM that serve a number of different operating system (OS) platforms.
- deep analytics - Deep analytics is the application of sophisticated data processing techniques to yield information from large and typically multi-source data sets composed of both unstructured and semi-structured data.
- demand planning - Demand planning is the process of forecasting the demand for a product or service so it can be produced and delivered more efficiently and to the satisfaction of customers.
- denormalization - Denormalization is the process of adding precomputed redundant data to an otherwise normalized relational database to improve read performance of the database.
- descriptive analytics - Descriptive analytics is a type of data analytics that looks at past data to give an account of what has happened.
- deterministic/probabilistic data - Deterministic and probabilistic are opposing terms that can be used to describe customer data and how it is collected.
- digital wallet - In general, a digital wallet is a software application, usually for a smartphone, that serves as an electronic version of a physical wallet.
- dimension - In data warehousing, a dimension is a collection of reference information that supports a measurable event, such as a customer transaction.
- dimension table - In data warehousing, a dimension table is a database table that stores attributes describing the facts in a fact table.
- dimensionality reduction - Dimensionality reduction is a process and technique to reduce the number of dimensions -- or features -- in a data set.
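A minimal sketch of one simple technique, low-variance feature filtering (a form of feature selection; more sophisticated methods such as PCA instead project data onto new axes). The data and threshold here are illustrative.

```python
def variances(rows):
    """Population variance of each feature (column) in a row-major data set."""
    n, d = len(rows), len(rows[0])
    out = []
    for j in range(d):
        col = [r[j] for r in rows]
        mean = sum(col) / n
        out.append(sum((x - mean) ** 2 for x in col) / n)
    return out

def drop_low_variance(rows, threshold=0.0):
    """Remove features whose variance does not exceed the threshold."""
    keep = [j for j, v in enumerate(variances(rows)) if v > threshold]
    return [[r[j] for j in keep] for r in rows]

data = [[1.0, 5.0, 0.2], [2.0, 5.0, 0.1], [3.0, 5.0, 0.3]]
reduced = drop_low_variance(data)  # the constant middle feature is dropped
print(reduced)  # [[1.0, 0.2], [2.0, 0.1], [3.0, 0.3]]
```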
- disambiguation - Disambiguation is the process of determining a word's meaning -- or sense -- within its specific context.
- disaster recovery (DR) - Disaster recovery (DR) is an organization's ability to respond to and recover from an event that affects business operations.
- distributed database - A distributed database is a database that consists of two or more files located in different sites either on the same network or on entirely different networks.
- distributed ledger technology (DLT) - Distributed ledger technology (DLT) is a digital system for recording the transaction of assets in which the transactions and their details are recorded in multiple places at the same time.