data life cycle
The data life cycle is the sequence of stages that a particular unit of data goes through from its initial generation or capture to its eventual archival and/or deletion at the end of its useful life.
Although specifics vary, data management experts often identify six or more stages in the data life cycle. Here's one example:
- Generation or capture: In this phase, data comes into an organization, usually through data entry, acquisition from an external source or signal reception, such as transmitted sensor data.
- Maintenance: In this phase, data is processed prior to its use. The data may be subjected to processes such as integration, scrubbing and extract-transform-load (ETL).
- Active use: In this phase, data is used to support the organization’s objectives and operations.
- Publication: In this phase, data is sent outside the organization, although it is not necessarily made available to the broader public. Publication may or may not be part of the life cycle for a particular unit of data.
- Archiving: In this phase, data is removed from all active production environments. It is no longer processed, used or published but is stored in case it is needed again in the future.
- Purging: In this phase, every copy of the data is deleted. Typically, this is performed on data that is already archived.
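The stages above can be sketched as a small state machine. This is only an illustration: the names `Stage`, `ALLOWED` and `advance` are hypothetical, and the transition rules are an assumption made for the example (publication is optional, and purging is only allowed from the archive, although the description above says only that this is typical).

```python
from enum import Enum


class Stage(Enum):
    """The six example stages of the data life cycle."""
    CAPTURE = "generation or capture"
    MAINTENANCE = "maintenance"
    ACTIVE_USE = "active use"
    PUBLICATION = "publication"
    ARCHIVING = "archiving"
    PURGING = "purging"


# Stages a unit of data may move to next. These rules are an assumption
# for illustration, not a standard: publication can be skipped, and data
# must be archived before it is purged.
ALLOWED = {
    Stage.CAPTURE: {Stage.MAINTENANCE},
    Stage.MAINTENANCE: {Stage.ACTIVE_USE},
    Stage.ACTIVE_USE: {Stage.PUBLICATION, Stage.ARCHIVING},
    Stage.PUBLICATION: {Stage.ARCHIVING},
    Stage.ARCHIVING: {Stage.PURGING},
    Stage.PURGING: set(),  # end of the life cycle
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the target stage, or raise if the move is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(
            f"cannot move from {current.value} to {target.value}"
        )
    return target
```

For example, `advance(Stage.ACTIVE_USE, Stage.ARCHIVING)` succeeds because publication is optional, while `advance(Stage.CAPTURE, Stage.PURGING)` raises a `ValueError`. An organization with different rules, such as restoring archived data to active use, would change only the `ALLOWED` table.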
Data lifecycle management (DLM) has become increasingly important with the explosion of big data and the ongoing development of the Internet of Things (IoT). An ever-increasing number of devices all over the world generate enormous volumes of data. Proper oversight of data throughout its life cycle is essential to optimize its usefulness and minimize the potential for errors. Archiving or deleting data at the end of its useful life also ensures that it does not consume more resources than necessary.
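As a rough sketch of how end-of-life handling might be automated, the function below decides what to do with a unit of data based on how long it has gone unused. The function name `retention_action` and both thresholds are hypothetical values chosen for the example; real retention periods depend on an organization's own legal and business requirements.

```python
from datetime import date, timedelta

# Hypothetical thresholds, chosen only for illustration.
ARCHIVE_AFTER = timedelta(days=365)      # unused for about one year
PURGE_AFTER = timedelta(days=365 * 7)    # unused for about seven years


def retention_action(last_used: date, today: date) -> str:
    """Return "keep", "archive" or "purge" based on time since last use."""
    idle = today - last_used
    if idle >= PURGE_AFTER:
        return "purge"
    if idle >= ARCHIVE_AFTER:
        return "archive"
    return "keep"
```

With these thresholds, data last used five months ago is kept, data last used four years ago is archived, and data last used fourteen years ago is purged.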