Top data backup software trends: Source dedupe and virtual machine backup

Source dedupe, improved virtual machine backup top list of 2010 trends in data backup software; snapshots, replication, and CDP also poised for long-term impact on traditional backup.

This year's most prominent trends in data backup software centered on the mainstream acceptance of source deduplication, improvements in virtual machine backup, and the impact of storage snapshots, remote replication and continuous data protection (CDP) on the traditional backup process.

A common thread in each of the trends is the industry's attempt to address the problem of explosive data growth, as IT shops confront escalating pressure to complete their backups within a timeframe that won't adversely affect their businesses.

"When you're trying to stuff more data, and time is a fixed variable, you've got to figure out ways to get it done faster," said Lauren Whitehouse, a senior analyst at Enterprise Strategy Group (ESG) in Milford, Mass. "Any method of moving less data for backup purposes, or streaming it faster, or processing it faster is definitely the driver."



Deduplication in data backup software

In interviews with 146 Fortune 1000 companies, TheInfoPro Inc. found that 46% are using data reduction/deduplication, 6% are piloting or evaluating the technology, and 31% have it in their near-term or long-term plans. Only 18% indicated they have no plans for data deduplication.

But, among those using deduplication, there are currently more dedicated appliance deployments than data backup software-based implementations, according to Dave Russell, a research vice president at Gartner Inc.

"What we're seeing so far is that some organizations are electing to do both," Russell said, "but they do it selectively, maybe doing files in the backup software and large databases with the appliances."

Source deduplication attacks the problem earlier in the process than target deduplication, removing redundant data at the application server, often through an agent installed on the server. The main advantage is the system sends less data over the network, reducing the strain on bandwidth; the chief downside is the additional processing overhead on the application server.

"That was a pretty big innovation, and it's definitely been a really popular feature for backup," said Rachel Dines, an analyst at Forrester Research Inc. in Cambridge, Mass. "It's great on file systems. I'm seeing a lot of people use it in virtual environments, because there's a lot of redundancy there as well. I'm also seeing people actually do backups over the WAN with source-side deduplication because there's so little data being sent."
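To make the source-side idea concrete, here is a minimal sketch of how an agent might avoid resending data the backup server already holds. It is an illustration only, not how Avamar or any other product mentioned here works: the fixed 4 KB chunk size, the `BackupTarget` class and the function names are all hypothetical, and commercial products typically use more sophisticated chunking.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, in bytes


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Yield (SHA-256 hex digest, chunk bytes) for each fixed-size chunk."""
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        yield hashlib.sha256(chunk).hexdigest(), chunk


class BackupTarget:
    """Hypothetical stand-in for the backup server: keeps each unique chunk once."""

    def __init__(self):
        self.chunks = {}

    def missing(self, digests):
        """Return the digests the server does not hold yet."""
        return {d for d in digests if d not in self.chunks}

    def store(self, digest, chunk):
        self.chunks[digest] = chunk


def source_side_backup(data, target):
    """Send only chunks the target lacks; return (recipe, bytes_sent)."""
    pairs = list(chunk_hashes(data))
    needed = target.missing(d for d, _ in pairs)
    bytes_sent = 0
    for digest, chunk in pairs:
        if digest in needed:
            target.store(digest, chunk)
            bytes_sent += len(chunk)
            needed.discard(digest)  # a chunk repeated within one backup is sent once
    recipe = [d for d, _ in pairs]  # ordered digests needed to rebuild the data
    return recipe, bytes_sent


def restore(recipe, target):
    """Rebuild the original bytes from the ordered list of digests."""
    return b"".join(target.chunks[d] for d in recipe)


# Two "daily" backups that differ only in their last chunk.
target = BackupTarget()
day1 = b"A" * 8192 + b"B" * 4096
day2 = b"A" * 8192 + b"C" * 4096
recipe1, sent1 = source_side_backup(day1, target)
recipe2, sent2 = source_side_backup(day2, target)
print(sent1, sent2)
```

In this toy run the first backup sends two unique chunks and the second sends only the one chunk that changed, which is the reason less data crosses the network. The hashing happens on the client, which is the processing overhead on the application server noted above.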

Source dedupe helps property company with remote backup

AMB Property Corp. is finding source dedupe especially helpful in simplifying backup at far-flung data centers in San Francisco (primary site), Las Vegas (disaster recovery site), Boston, Shanghai, Tokyo and Amsterdam. The company had first tried backing up its server systems to tape over the WAN, but it encountered trouble getting a consistent success rate.

After exhausting other options, the IT operations team this year elected to deploy grids of servers running EMC Corp.'s Avamar and install agents across the WAN links. The agents, which are licensed based on storage capacity, deduplicate the data on the backup clients/application servers.

"The first backup is the most time consuming and the heaviest one, but each subsequent backup goes very, very quickly, sending only the changed data," said Jason Leong, vice president of network operations at AMB.

A full backup of a data store once took eight to 12 hours and consumed a significant amount of bandwidth, but it now takes 30 minutes to 1.5 hours, according to Leong. The footprint is about 8 TB to 10 TB, rather than the roughly 30 TB that AMB would have had in the absence of deduplication. Daily backups of changed data are sometimes as low as 0.5 GB.
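As a rough check on those footprint figures, the short calculation below converts them into a reduction ratio and a percentage of capacity saved. It uses only the 30 TB and 8 TB to 10 TB numbers Leong cited. The function names are illustrative, and the result is back-of-the-envelope arithmetic rather than a measured figure.

```python
def reduction_ratio(logical_tb, stored_tb):
    """Ratio of data protected to data actually stored."""
    return logical_tb / stored_tb


def space_saved_pct(logical_tb, stored_tb):
    """Percentage of capacity avoided thanks to deduplication."""
    return (1 - stored_tb / logical_tb) * 100


LOGICAL_TB = 30  # estimated footprint without deduplication

for stored_tb in (8, 10):
    ratio = reduction_ratio(LOGICAL_TB, stored_tb)
    saved = space_saved_pct(LOGICAL_TB, stored_tb)
    print(f"{stored_tb} TB stored: {ratio:.2f}:1 reduction, {saved:.0f}% less capacity")
```

On those numbers, the footprint works out to a reduction of roughly 3:1 to 3.75:1, or about two-thirds to three-quarters less capacity.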

An ancillary benefit is the replication of the backup data to an additional Avamar system, protecting the data in two places geographically, Leong noted.

At its central data center, AMB currently uses Avamar's deduplication as well as traditional backup, with EMC's NetWorker and virtual tape libraries (VTLs). But, the company is working to integrate systems to enable NetWorker to use Avamar for backend storage and target-based deduplication, as it tries to shift away from VTLs and tapes, Leong said.

Virtual machine backup: Still a work in progress

As was the case at AMB, a shift in backup strategy often coincides with the increasing use of virtual servers. In turn, scores of vendors have been working to ease the backup of virtual machines (VMs).

"All of these guys, bar none, are making it easier to work with VMware, in particular," said Arun Taneja, founder and senior analyst at Taneja Group. "There were dramatic changes over the past year: a lot more visibility and much better efficiency in extracting duplication out of virtual machines.

"But, in the grand scheme of things, we're in the very early stages of how to protect virtual machines," Taneja added. "In spite of the innovations, backup is still pretty primitive in the virtual server world today."

VMware Inc., the leading virtual server technology vendor, made available vStorage APIs for Data Protection to enable data backup software to perform central VM backups without the overhead of running backup tasks from inside individual VMs.

Major data backup software vendors followed suit with improvements, in response to continuing pressure from VM backup specialists such as PHD Virtual Technologies Inc., Quest Software Inc. and Veeam Software Corp., each of which designed software with virtual server environments in mind.

Bill Wheeler, a Windows operations manager at VW Credit Inc. (VCI), a subsidiary of Volkswagen Group of America Inc., said the company had no desire to abandon its Symantec Corp. NetBackup software as part of its backup overhaul to accommodate its heavily virtualized server environment.

Instead, VW Credit added Quest's vRanger to supplement NetBackup. Prior to vRanger, VW Credit had been backing up its VMs as if they were physical machines, installing a NetBackup agent on each VM and backing up within the VM. In contrast, vRanger installs outside the VM and backs up the virtual machine disk (VMDK) files, Wheeler said.

"If I've got 20 virtual servers on the [physical] machine, vRanger grabs those 20 VMDK files and backs them up to whatever the destination media is," which, in VW Credit's case, is ExaGrid Systems Inc.'s backup appliance with deduplication, Wheeler said. "It takes the I/O and network performance hit out from within the box, so I'm no longer incurring that performance hit within the virtual machine itself."

But, now that Symantec's NetBackup has become "more virtualization-aware," Wheeler is considering the removal of vRanger in favor of a NetBackup-only approach.

VM backup product key piece of construction firm's backup strategy

Shawn Partridge, vice president of IT at Rockford Construction Co., has a different philosophy. He can't envision reversing his decision to use Veeam Backup & Replication and returning to Symantec's Backup Exec or any other traditional data backup software product.

"Going with a system that's been tweaked vs. a system built from the ground up to do this, it really wasn't a tough decision at all," Partridge said, noting that the Veeam software interacts directly with the VMware ESX Servers, without need of agents. "It triggers a snapshot on ESX, so there is nothing loaded on each of the servers."

This fall's TechTarget Storage Purchasing Intention survey showed that users of VM-specific backup products are in the minority. Among the 204 IT professionals backing up virtual servers, 40% used traditional backup and recovery software. The No. 2 approach was VMware Consolidated Backup (VCB), at 26%, followed by physical servers (17%), VM-specific products (11%) and CDP (6%).

Many IT shops opt for different products to back up their virtual and physical servers. An online survey conducted earlier this year by ESG showed that 56% of the 186 IT professionals responsible for data protection used separate backup applications for their virtual and physical servers, but only 23% viewed that as their preferred approach.

By contrast, most of the 44% who use a single backup application for their virtual and physical environments were content; 77% indicated that was their preferred approach, according to the ESG research.

Organizations more motivated to switch data backup software

Some industry analysts said they've noticed a greater willingness among IT shops to re-think their backup strategies, and even switch to other data backup software, to accommodate their changing needs.

Gartner's Russell predicted that over 30% of users will swap out their backup applications during the next four years in response to the "three Cs" of cost, complexity and capability.

"There has always been a myth that nobody switches backup products, which isn't actually very true. People are willing to ditch their backup products more than ever," Russell said.

San Francisco-based Union Bank is a long-time user of IBM's Tivoli Storage Manager (TSM), yet the IT department is weighing a switch to a more scalable product, according to Claudia Ku, senior vice president of IT. "We are undecided, because it's going to be a pain to go from TSM," Ku said.

Randy Kerns, a senior strategist at Evaluator Group, said users need to consider data protection, tiering and archiving as related technologies. One 2010 trend that Kerns noted was the integration of backup and archiving in products such as Symantec's Backup Exec and CommVault Systems Inc.'s Simpana.

Farm Credit Services of Mid-America, for instance, uses Simpana to back up data to its Spectra Logic Corp. nTier500 VTL-based disk system as well as to archive data to partitions in its Spectra Logic T50e tape library.

"We use it as another level of storage, almost like a fourth tier," said Fred Gordon, a storage administrator at the Louisville, Ky.-based financial institution. "We select certain criteria in data, and the system actually moves it to this tier of storage. To the end users, they really can't tell the difference. It's not that big of a wait time, because most of the stuff we're putting on there is older data anyway."

Forrester's Dines predicted a shift toward "the all-encompassing continuity suite" that might incorporate not only backup and recovery but also deal with snapshots, replication, CDP and possibly even failover orchestration.

"Symantec's a great example," she said. "They have products in CDP and in replication and in backup and in archive. I'm really expecting to see that come together into a single console to be managed from one platform.

"We're already seeing backup software start to be able to control snapshots on the array and replication," Dines continued. "If the backup software can extend to be the manager and monitor of all sorts of continuity technologies and activities, that would be really interesting. I think that's a direction a lot of providers are taking."

Lauren Whitehouse noted a number of vendors that beefed up their capabilities to leverage snapshots in primary storage systems, suggesting "a different way of capturing the data to accelerate the backup process."

Eric Slack, an analyst at Storage Switzerland LLC, pointed to trends with products that "effectively do data protection without a backup system," such as StorSimple Inc.'s hybrid cloud product providing snapshot-based backups and Nimble Storage Inc.'s iSCSI array that converges storage, backup and DR.

"This is the direction I think things are going," Slack wrote in an email. "If you can integrate data protection into your storage infrastructure and skip the backup process and backup products at the same time, why not? You've improved your backup."
