What is copy data management?
Copy data management (CDM) is an approach to reducing storage consumption that involves eliminating the unnecessary duplication of production data.
Backup software and other enterprise applications operate independently, often creating multiple copies of the same data. Redundant copies of the same data, however, can waste storage space, slow network performance, and make accessing or restoring mission-critical data more difficult. CDM software can help eliminate those problems by reducing the number of full copies using data virtualization.
How CDM works
Most copy data management software works by creating one full virtual copy of the data. When unique changes are made to the production environment, the software creates and stores a snapshot of incremental changes at the block level. The snapshot mechanism creates a read/write differencing disk with a parent-child relationship to the backup copy without creating an entire new copy.
Because write operations are not directed to the backup copy, administrators do not have to worry about the contents of the primary backup being changed by accident. Reducing the number of full copies also reduces the chance of server sprawl and improves costs, because valuable storage space is not taken up with unnecessary copies of data.
The term copy data management was made popular by virtualization software vendor Actifio, which Google acquired in 2020. Actifio captured production data, kept a golden master current and spawned unlimited virtual copies for use whenever a copy of production data was needed. In a development, analytics or test environment, this approach is especially important; it means that the development, analytics or test environment can be based on an exact replica of the organization's production data without unnecessarily consuming storage space before it is needed.
Why copy data management is important
As storage capacity expands, it throws the need for copy data management into sharp relief. Data is growing at a steady rate, and unnecessary copies of data take up much-needed storage space. Storage virtualization has benefited backup and recovery, but the creation and storage of extra copies of data can be a pain point.
Because storing multiple copies and backing up often are standard data protection practices, the number of copies can quickly grow out of hand. Efficiency and productivity can get bogged down by excess amounts of copy data, and for many organizations a lagging system won't cut it.
All that extra storage space comes at a cost. Data storage isn't cheap, and the more capacity for copies of data, the more money organizations waste on unnecessary storage expenses. By eliminating extra copies of data, organizations can increase efficiency and free up expensive storage space.
According to storage expert Steve Ricketts, copy data management has these benefits:
- accelerates application release cycles, improves decision-making, and increases efficiency and productivity with fast, easy and self-directed access to copy data in the appropriate format;
- ensures compliance and mitigates security risks by having greater visibility into copy data usage;
- lowers storage administration costs through centralized control, automation and orchestration; and
- reduces storage costs by having the right number of copies of data with the right policies on the right storage.
Copy data management vs. backup
While copy data management is a useful backup tool, it should not be considered a replacement for traditional backup. CDM was not designed for data protection, and was primarily created for storage efficiency. While CDM can help create data recovery points, it does not create a true backup of the data source.
Storage snapshots are used in both traditional data backup and CDM. Though organizations should not consider snapshots a replacement for backup, both snapshots and backup often give peace of mind through redundancy; CDM may not serve the same purpose.
How to find the right CDM product
While some features and capabilities are consistent across copy data management platforms, products vary among vendors.
Vendors that offer CDM include Commvault, Dell Technologies, Delphix, Google and IBM. Catalogic sold its copy data management business to IBM in 2021.
Most CDM vendors sell products that can import data from production platforms into storage managed by their software. Others allow data management across different traditional storage products from other vendors. Mainstream CDM products work with physical and virtual sources, and some offer appliances for secondary storage.
Editor's note: TechTarget editors revised this article in 2023 to improve the reader experience.