WANdisco, Azure do data migration dance

WANdisco has deepened its Microsoft partnership, integrating LiveData as a native service in Azure. Customers can access LiveData through Azure Portal for backup and other uses.

WANdisco has integrated its big data migration with the Microsoft Azure cloud.

WANdisco LiveData Platform for Azure -- in customer preview -- is designed to make it easier to move petabytes of data to Azure. Customers can discover LiveData through Azure Marketplace and access its services directly through the Azure Portal and the Azure command-line interface (CLI). With LiveData, customers can perform large-scale migration of Hadoop data to Azure and enable cloud-based backup, disaster recovery (DR) and cloud bursting. As a native service, LiveData Platform for Azure will show up on the same bill as other Azure services.

WANdisco also launched LiveData Migrator and LiveData Plane for the new Azure-based platform. The two work together to keep data consistent between an on-premises Hadoop environment and Azure Data Lake Storage. LiveData Migrator performs a one-time scan of the on-premises data and feeds it to LiveData Plane, which captures any changes after that point.

LiveData can scan through petabyte-scale data and generate a copy in the cloud while ensuring both copies are the same. It is powered by WANdisco Fusion, a consensus engine that keeps data consistent and available across multiple environments. Because it is a single scan and data migration is continuous, nothing needs to be shut down. This integration with Azure makes it easier for Azure customers to discover and deploy LiveData.
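
The scan-then-capture pattern described above can be illustrated with a minimal Python sketch. It is not WANdisco's implementation or API: the `TrackedStore` class, the function names and the in-memory dictionaries standing in for Hadoop and Azure storage are all hypothetical, and a real system would also need the consensus step that Fusion provides.

```python
import hashlib


class TrackedStore:
    """Hypothetical stand-in for a source file system that logs every write."""

    def __init__(self, objects=None):
        self.objects = dict(objects or {})
        self.change_log = []  # ordered list of (operation, path, data)

    def put(self, path, data):
        self.objects[path] = data
        self.change_log.append(("put", path, data))

    def delete(self, path):
        self.objects.pop(path, None)
        self.change_log.append(("delete", path, None))


def initial_scan(source, target):
    """One-time pass: copy a snapshot of every source object to the target."""
    for path, data in list(source.objects.items()):
        target[path] = data


def replay_changes(source, target, position):
    """Apply changes logged since `position`; return the new log position."""
    for operation, path, data in source.change_log[position:]:
        if operation == "put":
            target[path] = data
        else:  # "delete"
            target.pop(path, None)
    return len(source.change_log)


def in_sync(source_objects, target):
    """Compare both copies by path set and SHA-256 checksum of each object."""
    if source_objects.keys() != target.keys():
        return False
    return all(
        hashlib.sha256(source_objects[p]).digest() == hashlib.sha256(target[p]).digest()
        for p in source_objects
    )


# The source stays writable the whole time; nothing is shut down.
source = TrackedStore({"/data/a.csv": b"1,2,3", "/data/b.csv": b"4,5,6"})
target = {}

initial_scan(source, target)

# Production keeps changing data after the scan.
source.put("/data/a.csv", b"1,2,3,4")
source.put("/data/c.csv", b"7")
source.delete("/data/b.csv")

position = replay_changes(source, target, 0)
print(in_sync(source.objects, target))
```

In this toy version, replaying the change log after the single scan is what brings the second copy back in line with the first, which is why the source never has to go offline.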

LiveData's ability to move petabytes of data without interrupting production and without risk of losing the data midflight is something no other vendor offers, said Merv Adrian, Gartner research vice president of data and analytics. Moving data at this scale takes a long time, and traditionally involves physically shipping servers loaded with data to a cloud provider, transferring data to the cloud during non-peak hours, or both. With these methods, the data is inaccessible during migration. As a result, Adrian said, enterprises tend not to move live, active data this way.

"Taking everything down until I'm finished isn't an option," Adrian said.

LiveData doesn't technically "finish" the migration until later, but customers can access and make changes to all the data mid-migration. LiveData ensures those changes are reflected in all copies. Adrian said that's an important differentiator from other migration tools.

WANdisco LiveData does not yet have similar integration with AWS or Google Cloud, but Adrian said that the Azure integration makes the most sense. AWS has larger adoption, but Adrian pointed out that AWS and Google have no on-premises presence -- those customers are already on the cloud. Microsoft customers are most likely hybrid, running Microsoft products in their data centers while also dipping into Azure for their cloud needs. They are the customers most likely to be juggling petabytes of data between on-premises and cloud.

Screenshot: LiveData Platform for Azure can be discovered and deployed in the Azure Portal.

WANdisco CEO and founder David Richards said WANdisco focuses on serving the enterprise market. He said while AWS has higher general market adoption, its adoption among enterprises is similar to Azure's. He also said Azure adoption is growing faster among enterprises, partly because Microsoft's office productivity and collaboration tools, both on- and off-premises, are widely popular.

Richards said cloud demand is spiking because of an increase in at-home workers as well as companies investing in AI and machine learning. Business has slowed across the board due to the COVID-19 pandemic, and companies are thinking of ways to modernize and transform their businesses in response. Investing in AI -- specifically, the ability to make better decisions automatically -- is a way for businesses to differentiate themselves.

"Businesses have to now reinvent themselves, but that has to come with severe IT mobilization," Richards said. "The boldest move a company can make is looking at AI."

Adrian brought up another point about the interplay between COVID-19 and cloud: many businesses are looking to cut costs, and CTOs are going to look at putting hardware on the chopping block. He said it depends on the workload, but in most cases, the total cost of ownership over three years for hosting on the cloud is cheaper than provisioning all the necessary hardware, floor space and cooling to host it on-premises.

Determining these costs and identifying which workloads are actually cheaper on the cloud is still a "black art," Adrian said. It takes meticulous modeling to map out costs, and those models could still be wrong because the demands of the workloads and the cost of the cloud could grow or shrink unpredictably. However, Adrian said AI and machine learning are absolutely better done on the cloud because of the "bursty" nature of their compute demands.
