EMC Corp. made available the 3.0 release of its XtremIO all-flash array today with new inline compression capabilities and performance improvements – but existing customers who want the software upgrade need to prepare for significant disruption to their production environments.
Josh Goldstein, vice president of marketing and product management for EMC’s XtremIO business unit, confirmed that users will need to move all of their data off of their XtremIO arrays to do the upgrade and then move it back onto the system once the work is complete.
EMC’s XtremIO division has taken some heat on the disruptive – and some say “destructive” – nature of the upgrade, especially in view of the company’s prior claims that the product supported non-disruptive upgrades (NDU) of software.
Goldstein said the company decided to make an exception for the 3.0 upgrade based on customer input about the inline compression capabilities, which he claimed could double the usable capacity of an XtremIO array in many cases.
“This was a choice that was made, and it was not an easy choice,” said Goldstein. “We could have delayed the feature. Originally we were planning to put this in later in the roadmap. If we had chosen to, we could have tied this to another hardware release, and it would have been something that existing customers could never take advantage of. Our customer base told us emphatically that that was not what they wanted.”
Goldstein said that EMC will provide, at no cost to customers, the option of professional services and extra XtremIO “swing” capacity, if necessary, to ensure that they have an equivalent system on which to put their data while the upgrade is taking place.
The disruptive nature of the 3.0 upgrade came to light recently through an “XtremIO Gotcha” blog post from Andrew Dauncey, a leader of the Melbourne, Australia, VMware user group (VMUG). Dauncey wrote: “As a customer with limited funds, this is the only array for a VDI project, where the business runs 24/7, so to have to wipe the array has massive impacts.” He said a systems integrator had offered a loan device to help with the upgrade.
Dauncey worked as a systems engineer at a public hospital in Australia at the time of his initial post on Sept. 14. He has since gone to work for IBM as a virtualization specialist.
In a blog post last Sunday, Dauncey noted EMC’s “marketing collateral” that advertised “non-disruptive software and firmware upgrades to ensure 7×24 continuous operations,” and he accused EMC of “false advertising” prior to the release of the updated 3.0 firmware.
Goldstein said, “The releases that we’ve had from the time the product went GA up until now were all NDU. The releases that we have going forward after this point will all be NDU as well.”
Chris Evans, an IT consultant who writes the blog “Architecting IT,” said via an e-mail that HP upgraded its platform to cater to flash, and SolidFire released a new on-disk structure in the operating system for its all-flash array without disruptive upgrades. He said, what’s surprising in the XtremIO case is “that EMC didn’t foresee the volume of memory to store metadata only 12 months after their first release of code.”
Chad Sakac, a senior vice president of global systems engineering at EMC, shed some light on the technical underpinnings of the upgrade through his “Virtual Geek” personal blog, which he said has no affiliation to the company. He said the 2.4 to 3.0 upgrade touches both the layout structure and metadata indirection layer, and as a result, is disruptive to the host. He pointed to what he said were similar examples from “the vendor ecosystem.”
Goldstein confirmed that the block size is changing from 4KB to 8 KB, but he said the block-size change is not the main reason for the disruptive upgrade. He said it’s “all these things taken together” that the company is doing to both add compression and improve performance.
“We already had inline deduplication in the array, and that means that you have to have metadata structures that can describe how to take unique blocks and reconstitute them into the information that the customers originally stored,” Goldstein said. “When you add inline compression, you have to have similar metadata information about how to reconstitute compressed blocks into what the customer originally stored. Those kinds of changes are things that change the data structures in the array, and that’s what we had to update.”
Goldstein said customers should not have to endure an outage in the “vast majority of cases.” Goldstein claimed that XtremIO is “overwhelmingly used” in virtual environments and moving virtualized workloads is not difficult. He mentioned VMware’s Storage VMotion and EMC’s PowerPath Migration Enabler as two of the main options to help, but he said there are others.
Customers also may choose to remain on the 2.4 code that EMC released in May. Goldstein said that EMC will continue to provide bug fixes on the prior 2.4 release for “quite a long time.”
“There’s nothing forcing them to upgrade,” he said.
Craig Englund, principal architect at Boston Scientific Corp., said EMC contacted Boston Scientific’s management about the disruptive upgrade in the spring. At the time, the IT team already had a loaner array for test purposes, and they asked to keep it longer after learning the 3.0 upgrade was destructive.
“It reformats the array. It’s destructive. It’s not just disruptive,” Englund said. “You have to move all of your data off the array for them to perform the upgrade.”
But, Englund said the team can “move things around storage-wise non-disruptively” because the environment is highly virtualized through VMware. He said he’s willing to go through the inconvenience to gain the ability to run more virtual desktops and SQL Server databases on the company’s existing XtremIO hardware. Early tests have shown a 1.9 to 1 capacity improvement for the database workloads and 1.3 to 1 for VDI, he said.
“They could have said, ‘If you want these new features, it’s coming out in the next hardware platform, and you’ll have to buy another frame to get it.’ But, they didn’t, and I think that’s great,” Englund said. “To try to get this out to all of the existing customers before they get too many workloads on them, I think, was considerate.”