Does data deduplication to tape storage make sense?

Tools now exist to deduplicate data to tape, but the concept still has a way to go before it catches on because of potential costs and management issues.

Data deduplication, a key technology driving the popularity of disk backups, is creeping toward playing a role in backups to tape as well. Data backup vendors CommVault Systems Inc. and CA recently launched products that support dedupe to tape as well as disk, and while it may eventually take off, the concept still makes people nervous.

While data deduplication shrinks the amount of data organizations need to back up, there is concern that it may add complexity to the simple backup and recovery processes of tape.

Earlier this year, CommVault released Simpana 8 with the ability to store deduplicated data on tape. Simpana allows for writes to physical tape libraries without requiring reinflation of deduplicated data. CA's ARCserve Backup 12.5 incorporates data deduplication for backups to disk and tape, giving customers the choice of "reinflating" data on writes to tape or copying data directly from deduplicated disk.

But while deduping to tape should allow organizations to buy less tape storage, it could increase the amount of disk needed for backups.

Independent backup expert W. Curtis Preston said that a file that is constantly updated and backed up could end up requiring a number of different tapes once it is deduplicated to tape.

Simpana uses a container concept to deal with that, with the bulk of the storage coming from disk. But for files that are constantly updated, this raises questions about how large the container needs to be. "The container method will lower the overall dedupe ratio, but you'll have to buy more storage," Preston said. "In order to save more money on tape, it would cost more money on disk."

Mo Cook, senior sales support engineer for Sparks, Md.-based Systems Alliance Inc., agrees that dedupe to tape could greatly complicate a simple restore process. "It is quite possible that multiple mounts would be required to recover a file or group of files that would otherwise exist on one cartridge," he said.
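
The multiple-mount concern Cook and Preston describe can be illustrated with a minimal sketch. It assumes a simplified model with fixed-size chunks and one tape per nightly backup; the names `backup`, `tapes_needed` and the tape labels are hypothetical and do not reflect how Simpana or ARCserve actually lay out data.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; unrealistically small so the example stays readable


def chunk_hashes(data: bytes) -> list[str]:
    """Split data into fixed-size chunks and return each chunk's SHA-256 digest."""
    return [
        hashlib.sha256(data[i:i + CHUNK_SIZE]).hexdigest()
        for i in range(0, len(data), CHUNK_SIZE)
    ]


def backup(data: bytes, tape: str, index: dict[str, str]) -> list[str]:
    """Record new chunks as living on `tape`; return the file's ordered chunk list."""
    recipe = chunk_hashes(data)
    for digest in recipe:
        # A chunk that was already written stays on its original tape.
        index.setdefault(digest, tape)
    return recipe


def tapes_needed(recipe: list[str], index: dict[str, str]) -> set[str]:
    """Return the tapes that must be mounted to restore one file version."""
    return {index[digest] for digest in recipe}


# Hypothetical scenario: the same file changes a little each day.
index: dict[str, str] = {}
monday = backup(b"AAAABBBBCCCC", "tape-1", index)
tuesday = backup(b"AAAABBBBDDDD", "tape-2", index)
wednesday = backup(b"AAAAEEEEDDDD", "tape-3", index)

print(sorted(tapes_needed(monday, index)))     # ['tape-1']
print(sorted(tapes_needed(wednesday, index)))  # ['tape-1', 'tape-2', 'tape-3']
```

In this toy model, Monday's version restores from a single cartridge, while Wednesday's version needs three mounts because its unchanged chunks were never rewritten.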

Cook added that delays could arise when recovering and verifying the integrity of the tape dedupe index. And if a dedupe index is corrupted, all data on that tape could become useless. For some, this may be a risk that is simply not worth taking for the benefits of dedupe.

Gregg Paulk, director of information technologies for Anderson Center for Autism in Staatsburg, New York, eliminated tape from his environment by using NEC HydraStor with dedupe for disk backups. He said he doesn't see great value in deduping to tape.

"One of the things I would imagine that would occur with tape is that if you lose a chunk while it's being deduplicated, that tape is basically going to be invalid," Paulk said.
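
The risk behind both the corrupted-index and lost-chunk scenarios is that deduplicated files share chunks. A rough sketch, assuming each file is stored as an ordered list of chunk IDs (the file names, chunk IDs and `affected_files` helper are made up for illustration):

```python
def affected_files(recipes: dict[str, list[str]], lost: set[str]) -> list[str]:
    """Return the names of files whose chunk list references any lost chunk ID."""
    return sorted(
        name for name, chunks in recipes.items() if lost & set(chunks)
    )


# Hypothetical catalog: two versions of a database share chunks c1 and c2.
recipes = {
    "payroll.db": ["c1", "c2", "c3"],
    "payroll.db.bak": ["c1", "c2", "c4"],
    "notes.txt": ["c5"],
}

print(affected_files(recipes, {"c2"}))  # ['payroll.db', 'payroll.db.bak']
```

Without deduplication, damage to one region of tape would typically affect only the file stored there; here a single shared chunk takes out every file that points to it.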

He doesn't see dedupe improving any of the speed, reliability or security issues he used to run into with tape. While Paulk noted that the price point may be good, "if you can't get your data off, it could cost you your job," he said.

Deduping to tape can also make disaster recovery more burdensome, Cook said. To send tapes offsite, he said, "complete image copies would have to be recreated to provide a copy to be sent offsite, or you would have to send a copy of the index offsite with the deduped tapes."
