Data and data management
Terms related to data, including definitions about data warehousing and words and phrases about data management.
- security information management (SIM) - Security information management (SIM) is the practice of collecting, monitoring and analyzing security-related data from computer logs and various other data sources.
- self-driving car (autonomous car or driverless car) - A self-driving car -- sometimes called an autonomous car or driverless car -- is a vehicle that uses a combination of sensors, cameras, radar and artificial intelligence (AI) to travel between destinations without a human operator.
- self-service analytics - Self-service analytics is a type of business intelligence (BI) that enables business users to access, manipulate, analyze and visualize data, as well as generate reports based on their discoveries.
- sensitive information - Sensitive information is data that must be protected from unauthorized access to safeguard the privacy or security of an individual or organization.
- SequenceFile - A SequenceFile is a flat, binary file type that serves as a container for data to be used in Hadoop distributed compute projects.
- serverless database - A serverless database is a type of cloud database that is fully managed for an organization by a cloud service provider and runs on demand as needed to support applications.
- SNOMED CT (Systematized Nomenclature of Medicine -- Clinical Terms) - SNOMED CT (Systematized Nomenclature of Medicine -- Clinical Terms) is a standardized, multilingual vocabulary of clinical terminology that is used by physicians and other health care providers for the electronic exchange of health information.
- snowflaking (snowflake schema) - In data warehousing, snowflaking is a form of dimensional modeling in which dimensions are stored in multiple related dimension tables.
- software-defined storage (SDS) - Software-defined storage (SDS) is a software program that manages data storage resources and functionality and has no dependencies on the underlying physical storage hardware.
- spatial data - Spatial data is any type of data that directly or indirectly references a specific geographical area or location.
- standard business reporting (SBR) - Standard business reporting (SBR) is a group of frameworks adopted by governments to promote standardization in reporting business data.
- star schema - A star schema is a database organizational structure optimized for use in a data warehouse or business intelligence that uses a single large fact table to store transactional or measured data, and one or more smaller dimensional tables that store attributes about the data.
- statistical analysis - Statistical analysis is the collection and interpretation of data in order to uncover patterns and trends.
- storage class memory (SCM) - Storage class memory (SCM) is a type of physical computer memory that combines dynamic random access memory (DRAM), NAND flash memory and a power source for data persistence.
- structured data - Structured data is data that has been organized into a formatted repository, typically a database.
- supply chain planning (SCP) - Supply chain planning (SCP) is the process of anticipating the demand for products and planning their materials and components, production, marketing, distribution and sale.
- system of record (SOR) - A system of record (SOR) is an information storage and retrieval system that stores valuable data on an organizational system or process.
- System Restore (Windows) - System Restore is a Microsoft Windows utility designed to protect and revert the operating system (OS) to a previous state.
- T-SQL (Transact-SQL) - T-SQL (Transact-SQL) is a set of programming extensions from Sybase and Microsoft that add several features to the Structured Query Language (SQL), including transaction control, exception and error handling, row processing and declared variables.
- table - A table in computer programming is a data structure used to organize information, just as it is on paper.
- text mining (text analytics) - Text mining is the process of exploring and analyzing large amounts of unstructured text data aided by software that can identify concepts, patterns, topics, keywords and other attributes in the data.
- timeline - A timeline is a visual representation of a chronological sequence of events along a drawn line that helps a viewer understand time relationships.
- transactional data - In computing, transactional data is the information collected from transactions.
- transcription error - A transcription error is a type of data entry error commonly made by human operators or by optical character recognition (OCR) programs.
- transportation management system (TMS) - A transportation management system (TMS) is specialized software for planning, executing and optimizing the shipment of goods.
- tree structure - A tree structure is a hierarchical data structure used for placing and locating files (called records or keys) in a database.
- utility storage - Utility storage is a service model in which a provider makes storage capacity available to an individual, organization or business unit on a pay-per-use basis.
- virtual desktop - A virtual desktop is a computer operating system that does not run directly on the endpoint hardware from which a user accesses it.
- volatile memory - Volatile memory is a type of memory that maintains its data only while the device is powered.
- web analytics - Web analytics is the process of analyzing the behavior of visitors to a website by tracking, reviewing and reporting the data generated by their use of the site and its components, such as its webpages, images and videos.
- web services - Web services are a type of internet software that use standardized messaging protocols and are made available from an application service provider's web server for a client or other web-based programs to use.
- WebLogic - Oracle WebLogic Server is a leading e-commerce online transaction processing (OLTP) platform, developed to connect users in distributed computing production environments and to facilitate the integration of mainframe applications with distributed corporate data and applications.
- What are autonomous AI agents and which vendors offer them? - Autonomous artificial intelligence (AI) agents are intelligent systems that can perform tasks for a user or system without human intervention.
- What are data silos and what problems do they cause? - A data silo is a repository of data that's controlled by one department or business unit and isolated from the rest of an organization, much like grass and grain in a farm silo are closed off from outside elements.
- What are graph neural networks (GNNs)? - Graph neural networks (GNNs) are a type of neural network architecture and deep learning method that can help users analyze graphs, enabling them to make predictions based on the data described by a graph's nodes and edges.
- What are knowledge-based systems (KBSes)? - Knowledge-based systems (KBSes) are computer programs that use a centralized repository of data known as a knowledge base to help solve problems.
- What are spreadsheets and how do they work? - A spreadsheet is a computer program that can capture, display and manipulate data arranged in rows and columns.
- What is a 3-tier application architecture? - A three-tier application architecture is a modular client-server architecture that consists of a presentation tier, an application tier and a data tier.
- What is a backpropagation algorithm? - A backpropagation algorithm, or backward propagation of errors, is an algorithm that's used to help train neural network models.
- What is a business intelligence dashboard (BI dashboard)? - A business intelligence dashboard, or BI dashboard, is a data visualization and analysis tool that displays on one screen the status of key performance indicators (KPIs) and other important business metrics and data points for an organization, department, team or process.
- What is a chief data officer (CDO)? - A chief data officer (CDO) in many organizations is a C-level executive whose position has evolved into a range of strategic data management responsibilities, including data governance, data quality and data strategy.
- What is a clinical trial? - Clinical trials, also known as clinical research studies, are carefully designed investigations involving volunteer human participants to evaluate the safety, efficacy and outcomes of medical or surgical interventions.
- What is a computer-assisted coding system (CACS)? - A computer-assisted coding system (CACS) is software that analyzes healthcare documents and automatically produces appropriate medical codes, like the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), ICD-10-CM and the American Medical Association's Current Procedural Terminology (CPT) for specific phrases and terms within the document.
- What is a consensus algorithm? - A consensus algorithm is a process in computer science used to achieve agreement on a single data value among distributed processes or systems.
- What is a data architect? - A data architect is an IT professional responsible for defining the policies, procedures, models and technologies used in collecting, organizing, storing and accessing company information.
- What is a data flow diagram (DFD)? - A data flow diagram (DFD) is a graphical or visual representation that uses a standardized set of symbols and notations to describe a business's operations through data movement.
- What is a data governance policy? - A data governance policy is a documented set of guidelines for ensuring an organization's data and information assets are managed consistently and used properly.
- What is a data lake? - A data lake is a storage repository that holds a vast amount of raw data in its native format until it is needed for analytics applications.
- What is a data mart (datamart)? - A data mart is a repository of data that is designed to serve a particular community of knowledge workers.
- What is a distributed database? - A distributed database is a database that consists of two or more files located in different sites on the same or different networks.
- What is a framework? - In general, a framework is a real or conceptual structure intended to serve as a support or guide for the building of something that expands the structure into something useful.
- What is a health information exchange (HIE)? - Health information exchange (HIE) refers to the electronic transmission and exchange of patients' healthcare-related data among healthcare professionals, medical facilities, health information organizations, public health agencies and patients.
- What is a pandemic plan? - A pandemic plan is a documented strategy for how an organization plans to provide essential services when there is a widespread outbreak of an infectious disease.
- What is a pivot table? How to use in Excel and Sheets - A pivot table is a statistics tool that summarizes and reorganizes selected columns and rows of data in a spreadsheet or database table to obtain a desired report.
- What is a private cloud? Definition and examples - Private cloud is a type of cloud computing that delivers advantages similar to public cloud, including scalability and self-service, but through a proprietary architecture.
- What is a records retention schedule? - A records retention schedule is a policy that defines how long paper and electronic content must be kept and provides disposal guidelines for how those items should be discarded.
- What is a registered health information technician (RHIT)? - A registered health information technician (RHIT) is a certified professional who creates and verifies electronic health records.
- What is a semantic network? - A semantic network is a knowledge structure that depicts how concepts are related to one another and how they interconnect.
- What is a stored procedure? - A stored procedure is a set of Structured Query Language (SQL) statements that multiple programs can reuse and share to perform specific tasks.
- What is a support vector machine (SVM)? - A support vector machine (SVM) is a type of supervised learning algorithm used in machine learning to solve classification and regression tasks.
- What is a validation set? How is it different from test, train data sets? - A validation set is a set of data used during the training of artificial intelligence (AI) models to evaluate and tune them, with the goal of finding and optimizing the best model to solve a given problem.
- What is a vector database? - A vector database is a type of database technology that's used to store, manage and search vector embeddings, numerical representations of unstructured data that are also referred to simply as vectors.
- What is actionable intelligence? - Actionable intelligence is information that can be immediately used or acted upon, either tactically in direct response to an evolving situation, or strategically as the result of data analytics or some other assessment.
- What is AI ethics? - AI ethics is a system of moral principles and techniques intended to inform the development and responsible use of artificial intelligence technology.
- What is Allscripts? - Allscripts is a former vendor of electronic health record (EHR) systems and healthcare IT solutions, primarily serving physician practices, hospitals, and healthcare systems.
- What is an analytics database (analytical database)? - An analytics database, also called an analytical database, is a read-only system that stores historical data on business metrics such as sales performance and inventory levels.
- What is an API endpoint? - An API endpoint is a point at which an application programming interface -- the code that enables two software programs to communicate with each other -- connects with the software program.
- What is an enterprise master patient index (EMPI)? - An enterprise master patient index (EMPI) is a database that is used to maintain consistent and accurate information about each patient registered by a healthcare organization across its various departments.
- What is an entity relationship diagram (ERD)? - An entity relationship diagram (ERD), also known as an entity relationship model, is a graphical representation that depicts relationships among people, objects, places, concepts or events in an information technology (IT) system.
- What is an inductive argument? - An inductive argument is an assertion that uses specific premises or observations to make a broader generalization.
- What is an information system (IS)? - An information system (IS) is an interconnected set of components used to collect, store, process and transmit data and digital information.
- What is an NVDIMM (non-volatile dual in-line memory module)? - An NVDIMM (non-volatile dual in-line memory module) is hybrid computer memory that retains data during a service outage.
- What is anomaly detection? An overview and explanation - Anomaly detection is the process of identifying data points, entities or events that fall outside the normal range.
- What is artificial intelligence as a service (AIaaS)? - Artificial intelligence as a service (AIaaS) is a cloud-based service that enables organizations to access artificial intelligence (AI) through a third-party offering.
- What is big data analytics? - Big data analytics is the process of examining big data to uncover information -- such as hidden patterns, correlations, market trends and customer preferences -- that can help organizations make informed business decisions.
- What is bit rot? - Bit rot is the slow deterioration in the performance and integrity of data stored on storage media.
- What is Cerner Corp.? - Cerner Corp.
- What is Change Healthcare? - Change Healthcare is a healthcare technology provider specializing in revenue cycle management, payment management and health information exchange solutions.
- What is clinical informatics? - Clinical informatics is a specialized field of study in healthcare that focuses on using information technology and data analytics to improve patient care and administrative workflows in clinical settings.
- What is continuous monitoring? - Continuous monitoring constantly observes the performance and operation of IT assets to help reduce risk and improve uptime instead of taking a point-in-time snapshot of a device, network or application.
- What is corporate performance management (CPM)? - Corporate performance management (CPM) encompasses the processes and methodologies used to align an organization's strategies and goals to its plans and actions as a business.
- What is Current Procedural Terminology (CPT) code? - Current Procedural Terminology (CPT) is a medical code set that enables physicians and other healthcare providers to describe and report the medical, surgical, and diagnostic procedures and services they perform to government and private payers, researchers and other interested parties.
- What is customer data integration (CDI)? - Customer data integration (CDI) is the process of defining, consolidating and managing customer information across an organization's business units and systems to achieve a "single version of the truth" for customer data.
- What is customer intelligence (CI) and how does it help business? - Customer intelligence (CI) is the process of collecting and analyzing detailed customer data from internal and external sources to gain insights about customer needs, motivations and behaviors.
- What is customer segmentation? - Customer segmentation is the practice of dividing a customer base into groups of individuals that have similar characteristics relevant to marketing, such as age, gender, interests and spending habits.
- What is dark data? - Dark data is digital information an organization collects, processes and stores that is not currently being used for business purposes.
- What is data activation? - Data activation is a marketing approach that uses consumer information and data analytics to help companies gain real-time insight into target audience behavior and plan for future marketing initiatives.
- What is data aggregation? - Data aggregation is any process whereby data is gathered and expressed in a summary form.
- What is data analytics (DA)? - Data analytics (DA) is the process of examining data sets to find trends and draw conclusions about the information they contain.
- What is data architecture? A data management blueprint - Data architecture is a discipline that documents an organization's data assets, maps how data flows through IT systems and provides a blueprint for managing data.
- What is data as a service (DaaS)? - Data as a service (DaaS) is an information provision and distribution model in which data files -- including text, images, sounds and videos -- are made available to customers over a network, typically the internet.
- What is data automation? - Data automation is the use of software tools and infrastructure to streamline data management tasks.
- What is data cleansing (data cleaning, data scrubbing)? - Data cleansing, also referred to as data cleaning or data scrubbing, is the process of fixing incorrect, incomplete, duplicate or otherwise erroneous data in a data set.
- What is data curation? - Data curation is the process of creating, organizing and maintaining data sets so people looking for information can access and use them.
- What is data democratization? - Data democratization makes information in a digital format accessible to the average end user.
- What is data egress? How it works and how to manage costs - Data egress is when data leaves a closed or private network and is transferred to an external location.
- What is data governance and why does it matter? - Data governance is the process of managing the availability, usability, integrity and security of the data in enterprise systems, based on internal standards and policies that also control data usage.
- What is data in use? - Data in use is data that is currently being updated, processed, erased, accessed or read by a system, application, user or device.
- What is data labeling? - Data labeling is the process of identifying and tagging data samples, most commonly in the context of training machine learning (ML) models.
- What is data lifecycle? - A data lifecycle is the sequence of stages that a unit of data goes through from its initial generation or capture to its archiving or deletion at the end of its useful life.