Get started
Bring yourself up to speed with our introductory content.
9 metadata management standards examples that guide success
Organizations looking to implement metadata management can choose from existing standards that support archiving, sciences, finance and other kinds of digital resources. Continue Reading
What is corporate performance management (CPM)?
Corporate performance management (CPM) encompasses the processes and methodologies used to align an organization's strategies and goals to its plans and actions as a business. Continue Reading
What is NoSQL (Not Only SQL database)?
NoSQL is an approach to database management that can accommodate a wide variety of data models, including key-value, document, columnar and graph formats. Continue Reading
What is a data fabric?
A data fabric is an architecture and software offering a unified collection of data assets, databases and database architectures within an enterprise. Continue Reading
5 data governance framework examples
Data governance isn't plug and play: Organizations must select which data governance framework best fits their business goals and needs. Continue Reading
Definitions to Get Started
- What is corporate performance management (CPM)?
- What is NoSQL (Not Only SQL database)?
- What is a data fabric?
- What is Structured Query Language (SQL)?
- What is data validation?
- What is a data flow diagram (DFD)?
- What is denormalization and how does it work?
- What are data silos and what problems do they cause?
Successful data operations follow a data governance roadmap
Implementing a data governance strategy requires a roadmap to keep everyone on track and overcome challenges. Follow eight key steps for best results.Continue Reading
What is Structured Query Language (SQL)?
Structured Query Language (SQL) is a standardized programming language that is used to manage relational databases and perform various operations on the data in them.Continue Reading
What is data validation?
Data validation is the practice of checking the integrity, accuracy and structure of data before it is used for or by one or more business operations.Continue Reading
How to use Microsoft Copilot in Power BI
Enable Microsoft Copilot in Power BI to automate key features using generative AI capabilities that improve insights and accelerate decision-making.Continue Reading
Use RAG with LLMs to democratize data analytics
Pairing retrieval-augmented generation with an LLM helps improve prompts and outputs, democratizing data access and making previously elusive information available to more users.Continue Reading
What is a data flow diagram (DFD)?
A data flow diagram (DFD) is a graphical or visual representation that uses a standardized set of symbols and notations to describe a business's operations through data movement.Continue Reading
What is denormalization and how does it work?
Denormalization is the process of adding precomputed redundant data to an otherwise normalized relational database to improve read performance.Continue Reading
What are data silos and what problems do they cause?
A data silo is a repository of data that's controlled by one department or business unit and isolated from the rest of an organization, much like grass and grain in a farm silo are closed off from outside elements.Continue Reading
What is a data architect?
A data architect is an IT professional responsible for defining the policies, procedures, models and technologies used in collecting, organizing, storing and accessing company information.Continue Reading
What is a vector database?
A vector database is a type of database technology that's used to store, manage and search vector embeddings, numerical representations of unstructured data that are also referred to simply as vectors.Continue Reading
7 data modeling techniques and concepts for business
Three types of data models and various data modeling techniques are available to data management teams to help convert data into valuable business information.Continue Reading
How GenAI-created synthetic data improves augmentation
Synthetic data can enhance the performance and capabilities of data augmentation techniques. Navigate the challenges generative AI models present to reap the benefits.Continue Reading
What is master data management (MDM)?
Master data management (MDM) is a process that creates a uniform set of data on customers, products, suppliers and other business entities from different IT systems.Continue Reading
data structure
A data structure is a specialized format for organizing, processing, retrieving and storing data.Continue Reading
database management system (DBMS)
A database management system (DBMS) is a software system for creating and managing databases.Continue Reading
relational database
A relational database is a type of database that organizes data points with defined relationships for easy access.Continue Reading
Graph database vs. relational database: Key differences
Graph databases offer plenty of advantages for enterprises, but relational databases still top the market. Both emphasize relationships between data, but how do they compare?Continue Reading
data mesh
Data mesh is a decentralized data management architecture for analytics and data science.Continue Reading
RDBMS (relational database management system)
A relational database management system (RDBMS) is a collection of programs and capabilities that enable IT teams and others to create, update, administer and otherwise interact with a relational database.Continue Reading
data de-identification
Data de-identification is the decoupling or masking of data to prevent certain data elements from being associated with an individual. Continue Reading
What is data management and why is it important? Full guide
Data management is the process of ingesting, storing, organizing and maintaining the data created and collected by an organization, as explained in this in-depth guide.Continue Reading
database (DB)
A database is a collection of information that is organized so that it can be easily accessed, managed and updated.Continue Reading
Managing databases in a hybrid cloud: 10 key considerations
To manage hybrid cloud database environments, consider business and application goals, consistency, configuration management, synchronization, latency, security and stability.Continue Reading
hashing
Hashing is the process of transforming any given key or a string of characters into another value.Continue Reading
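As a minimal sketch of the idea, the Python snippet below transforms a string key into two other values: a fixed-length hexadecimal digest and a small bucket index. The helper names `hex_digest` and `bucket_for`, the choice of SHA-256 and the default of eight buckets are illustrative assumptions, not part of the definition above.

```python
import hashlib


def hex_digest(key: str) -> str:
    """Return the SHA-256 digest of a key as a 64-character hex string."""
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def bucket_for(key: str, num_buckets: int = 8) -> int:
    """Map a key to a bucket index (hypothetical helper for illustration)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Read the first 8 bytes as an unsigned integer, then reduce it
    # modulo the number of buckets to get a stable index.
    return int.from_bytes(digest[:8], "big") % num_buckets


if __name__ == "__main__":
    print(hex_digest("customer-42"))
    print(bucket_for("customer-42"))
```

The same key always produces the same digest and the same bucket, which is what makes hashed values usable for lookups and integrity checks.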
database administrator (DBA)
A database administrator (DBA) is the information technician responsible for directing and performing all activities related to maintaining and securing a successful database environment.Continue Reading
information
Information is the output that results from analyzing, contextualizing, structuring, interpreting or in other ways processing data.Continue Reading
The 5 components of a DataOps architecture
Reaping the benefits of DataOps requires good architecture. Use five core components to design a DataOps architecture that best fits organizational needs.Continue Reading
Hadoop Distributed File System (HDFS)
The Hadoop Distributed File System (HDFS) is the primary data storage system Hadoop applications use.Continue Reading
query
A query is a question or a request for information expressed in a formal manner.Continue Reading
database as a service (DBaaS)
Database as a service (DBaaS) is a cloud computing managed service offering that provides access to a database without requiring the setup of physical hardware, the installation of software or the need to configure the database.Continue Reading
data classification
Data classification is the process of organizing data into categories that make it easy to retrieve, sort and store for future use.Continue Reading
columnar database
A columnar database (column-oriented) is a database management system (DBMS) that stores data on disk in columns instead of rows.Continue Reading
schema
In computer programming, a schema (pronounced SKEE-mah) is the organization or structure for a database, while in artificial intelligence (AI), a schema is a formal expression of an inference rule.Continue Reading
spatial data
Spatial data is any type of data that directly or indirectly references a specific geographical area or location.Continue Reading
Vector vs. graph vs. relational database: Which to choose?
Vector databases enhance the use of generative AI. Organizations should consider how vector capabilities stack up vs. graph and relational databases before deciding which to use.Continue Reading
serverless database
A serverless database is a type of cloud database that is fully managed for an organization by a cloud service provider and runs on demand as needed to support applications.Continue Reading
Top 10 industry use cases for vector databases
Vector database popularity is rising as generative AI use increases across all industries. Here are 10 top use cases for vector databases that generate organizational value.Continue Reading
data analytics (DA)
Data analytics (DA) is the process of examining data sets to find trends and draw conclusions about the information they contain.Continue Reading
RFM analysis (recency, frequency, monetary)
RFM analysis is a marketing technique used to quantitatively rank and group customers based on the recency, frequency and monetary total of their recent transactions to identify the best customers and perform targeted marketing campaigns.Continue Reading
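The three RFM measures can be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names, the `(customer_id, date, amount)` tuple layout and the simple sort-based ranking are hypothetical, and practical RFM analysis often assigns quantile scores to each measure instead of sorting directly.

```python
from datetime import date


def rfm_summary(transactions, today):
    """Compute recency (days), frequency (count) and monetary total per customer.

    transactions: iterable of (customer_id, transaction_date, amount) tuples.
    """
    summary = {}
    for customer, when, amount in transactions:
        rec = summary.setdefault(
            customer, {"last": when, "frequency": 0, "monetary": 0.0}
        )
        rec["last"] = max(rec["last"], when)  # most recent purchase date
        rec["frequency"] += 1
        rec["monetary"] += amount
    return {
        customer: {
            "recency_days": (today - rec["last"]).days,
            "frequency": rec["frequency"],
            "monetary": rec["monetary"],
        }
        for customer, rec in summary.items()
    }


def rank_customers(rfm):
    """Order customers best-first: most recent, then most frequent, then highest spend.

    This lexicographic ordering is an assumption made for the sketch.
    """
    return sorted(
        rfm,
        key=lambda c: (rfm[c]["recency_days"], -rfm[c]["frequency"], -rfm[c]["monetary"]),
    )


if __name__ == "__main__":
    sample = [
        ("a", date(2024, 1, 1), 10.0),
        ("a", date(2024, 1, 10), 20.0),
        ("b", date(2023, 12, 1), 500.0),
    ]
    scores = rfm_summary(sample, today=date(2024, 1, 15))
    print(rank_customers(scores))
```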
entity relationship diagram (ERD)
An entity relationship diagram (ERD), also known as an 'entity relationship model,' is a graphical representation that depicts relationships among people, objects, places, concepts or events in an information technology (IT) system.Continue Reading
big data management
Big data management is the organization, administration and governance of large volumes of both structured and unstructured data.Continue Reading
big data
Big data is a combination of structured, semi-structured and unstructured data that organizations collect, analyze and mine for information and insights.Continue Reading
data modeling
Data modeling is the process of creating a simplified visual diagram of a software system and the data elements it contains, using text and symbols to represent the data and how it flows.Continue Reading
raw data (source data or atomic data)
Raw data is the data originally generated by a system, device or operation, and has not been processed or changed in any way.Continue Reading
data engineer
A data engineer is an IT professional whose primary job is to prepare data for analytical or operational uses.Continue Reading
Cloud DBA: How cloud changes database administrator's role
Cloud databases change the duties and responsibilities of database administrators. Here's how the job of a cloud DBA differs from what an on-premises one does.Continue Reading
flat file
A flat file is a collection of data stored in a two-dimensional database in which similar yet discrete strings of information are stored as records in a table.Continue Reading
Microsoft SQL Server
Microsoft SQL Server is a relational database management system (RDBMS) that supports a wide variety of transaction processing, business intelligence (BI) and data analytics applications in corporate IT environments.Continue Reading
Data management trends: GenAI, governance and lakehouses
The top data management trends of 2023 -- generative AI, data governance, observability and a shift toward data lakehouses -- are major factors for maximizing data value in 2024.Continue Reading
star schema
A star schema is a database organizational structure optimized for use in a data warehouse or business intelligence that uses a single large fact table to store transactional or measured data, and one or more smaller dimensional tables that store ...Continue Reading
data quality
Data quality is a measure of a data set's condition based on factors such as accuracy, completeness, consistency, reliability and validity.Continue Reading
data warehouse as a service (DWaaS)
Data warehouse as a service (DWaaS) is an outsourcing model in which a cloud service provider configures and manages the hardware and software resources a data warehouse requires, and the customer provides the data and pays for the managed service.Continue Reading
What is data governance and why does it matter?
Data governance is the process of managing the availability, usability, integrity and security of the data in enterprise systems, based on internal standards and policies that also control data usage.Continue Reading
Google Bigtable
Google Bigtable is a distributed, column-oriented data store created by Google Inc. to handle very large amounts of structured data associated with the company's Internet search and Web services operations.Continue Reading
Top 12 data observability use cases
Experts identify 12 top data observability use cases and examine how they influence all aspects of data management and governance operations.Continue Reading
MPP database (massively parallel processing database)
An MPP database is a database optimized for parallel processing, in which many operations are performed by many processing units at the same time. Continue Reading
big data engineer
A big data engineer is an information technology (IT) professional who is responsible for designing, building, testing and maintaining complex data processing systems that work with large data sets.Continue Reading
fact table
In data warehousing, a fact table is a database table in a dimensional model. The fact table stores quantitative information for analysis.Continue Reading
ESG data collection: Beginning steps and best practices
Sustainability initiatives won't succeed without quality data. Following an ESG data collection framework and best practices ensures program and reporting success.Continue Reading
5 V's of big data
The 5 V's of big data -- velocity, volume, value, variety and veracity -- are the five main and innate characteristics of big data.Continue Reading
Assemble the 6 layers of big data stack architecture
Assemble the six layers of a big data stack architecture to address the challenges organizations face with big data, which include increases in data size, speed and structure.Continue Reading
OLAP (online analytical processing)
OLAP (online analytical processing) is a computing method that enables users to easily and selectively extract and query data in order to analyze it from different points of view.Continue Reading
C++
C++ is an object-oriented programming (OOP) language that many view as the best language for creating large-scale applications. C++ is largely a superset of the C language. Continue Reading
dimension
In data warehousing, a dimension is a collection of reference information that supports a measurable event, such as a customer transaction.Continue Reading
disambiguation
Disambiguation is the process of determining a word's meaning -- or sense -- within its specific context.Continue Reading
consumer privacy (customer privacy)
Consumer privacy, also known as customer privacy, involves the handling and protection of the sensitive personal information provided by customers in the course of everyday transactions.Continue Reading
ACID (atomicity, consistency, isolation, and durability)
In transaction processing, ACID (atomicity, consistency, isolation, and durability) is an acronym and mnemonic device used to refer to the four essential properties a transaction should possess to ensure the integrity and reliability of the data ...Continue Reading
snowflaking (snowflake schema)
In data warehousing, snowflaking is a form of dimensional modeling in which dimensions are stored in multiple related dimension tables.Continue Reading
data virtualization
Data virtualization is an umbrella term used to describe an approach to data management that allows an application to retrieve and manipulate data without requiring technical details about the data.Continue Reading
conformed dimension
In data warehousing, a conformed dimension is a dimension that has the same meaning to every fact with which it relates.Continue Reading
dimension table
In data warehousing, a dimension table is a database table that stores attributes describing the facts in a fact table.Continue Reading
15 ways AI influences the data management landscape
AI, NLP and machine learning advancements have become core to data management processes. Ask tool vendors how they use -- or fail to use -- AI in these 15 areas.Continue Reading
How to create a data quality management process in 5 steps
Data quality requires accurate and complete data that fits task-based needs. These five steps establish a data quality management process to ensure data fits its purpose.Continue Reading
Hadoop
Hadoop is an open source distributed processing framework that manages data processing and storage for big data applications in scalable clusters of computer servers.Continue Reading
data lakehouse
A data lakehouse is a data management architecture that combines the key features and the benefits of a data lake and a data warehouse.Continue Reading
How to approach data mesh implementation
Data mesh takes a decentralized approach to data management, setting it apart from data lakes and warehouses. Organizations can transition to data mesh with progressive steps.Continue Reading
data pipeline
A data pipeline is a set of network connections and processing steps that moves data from a source system to a target location and transforms it for planned business uses.Continue Reading
Essential skills for data-centric developers
To become more data-driven, organizations need data-centric developers. Developers can learn a mix of technical and interpersonal skills to become attractive candidates for the job. Continue Reading
Data steward responsibilities fill data quality role
Data stewards tie together data operations. From quality to governance to boosting collaboration, data stewards are valuable members of any data effort.Continue Reading
Enhance data governance with distributed data stewardship
Data stewardship and distributed stewardship models bring different tools to data governance strategies. Organizations need to understand the differences to choose the best fit.Continue Reading
Data lakes: Key to the modern data management platform
Data lakes influence the modern data management platform at all levels. Organizations can gain faster insights, save costs, improve governance and boost self-service data access.Continue Reading
data integration
Data integration is the process of combining data from multiple source systems to create unified sets of information for both operational and analytical uses.Continue Reading
transcription error
A transcription error is a type of data entry error commonly made by human operators or by optical character recognition (OCR) programs.Continue Reading
MongoDB
MongoDB is an open source NoSQL database management program.Continue Reading
Data stewardship: Essential to data governance strategies
As data governance gets increasingly complicated, data stewards are stepping in to manage security and quality. Without one, organizations lose speed, quality info and opportunity.Continue Reading
data warehouse
A data warehouse is a repository of data from an organization's operational systems and other sources that supports analytics applications to help drive business decision-making.Continue Reading
database replication
Database replication is the frequent electronic copying of data from a database in one computer or server to a database in another -- so that all users share the same level of information.Continue Reading
DataOps
DataOps is an Agile approach to designing, implementing and maintaining a distributed data architecture that will support a wide range of open source tools and frameworks in production.Continue Reading
Comparing DBMS vs. RDBMS: Key differences
A relational database management system is the most popular type of DBMS for business uses. Find out how RDBMS software differs from other DBMS technologies.Continue Reading
data observability
Data observability is a process and set of practices that aim to help data teams understand the overall health of the data in their organization's IT systems.Continue Reading
What key roles should a data management team include?
These 10 roles, with different responsibilities, are commonly a part of the data management teams that organizations rely on to make sure their data is ready to use.Continue Reading
Data tenancy maturity model boosts performance and security
A data tenancy maturity model can boost an organization's data operations and help improve the protection of customer data. Improvement is tracked through tiers of data tenancy.Continue Reading
What is a data warehouse analyst?
Data warehouse analysts help organizations manage the repositories of analytics data and use them effectively. Here's a look at the role and its responsibilities.Continue Reading
Data observability benefits entire data pipeline performance
Data observability improves data quality and helps identify issues in the pipeline process, but it also presents challenges that organizations must solve to succeed. Continue Reading
OPAC (Online Public Access Catalog)
An OPAC (Online Public Access Catalog) is an online bibliography of a library collection that is available to the public. Continue Reading