Azure partner Aparavi takes on unstructured data management

Aparavi retired its Active Archive and File Protect & Insight products and brought their features to its new cloud-based data management and automation platform launched on Azure.

Aparavi has taken its unstructured data classification and global search capabilities, added automation and repackaged them into a cloud-based platform.

Simply called the Aparavi Platform, the new product is billed as an intelligent data management platform. It discovers data and metadata across multiple data sources, including on-premises devices and cloud storage. Through automated classification, Aparavi can help organizations determine the business value of their unstructured data, such as whether it contains personally identifiable information (PII) or other sensitive information. Aparavi also provides ways to migrate data between storage systems, as well as ways to set up such migrations automatically. The tool can also trigger actions based on criteria such as access requirements, regulatory policy and risk level.

Aparavi is targeted at organizations with large volumes of unstructured data. Customers can use the platform's interface to view their data where it's stored, determine the value of that data, set retention policies to automatically move, copy and delete it as needed, and perform cost analysis. Aparavi finds and indexes data without copying or moving it first, which means it won't incur any egress charges when dealing with cloud storage.

Aparavi aggregates data and metadata from multiple data sources.

The Aparavi Platform launched as a SaaS product on Microsoft Azure Marketplace. It is also available as an on-premises deployment and can be purchased directly through Aparavi and its channel partners. Aparavi's Active Archive and File Protect & Insight products were retired with the platform's launch, but they will continue to be supported by the vendor.

Aparavi Platform is similar to data management products such as Dell EMC's ClarityNow, StrongBox's StrongLink, and software from Hammerspace and Starfish.

Darryl Richardson, chief product evangelist at Aparavi, said organizations' data has become too voluminous and sprawling to be managed manually. Unstructured data growth has been a looming problem as more companies reach petabyte-scale volumes, but COVID-19 exacerbated the sprawl part of the problem. Richardson said there is a big gap in data protection and management for endpoints and laptops, which now hold more critical data due to a mostly at-home workforce. He also said there is a lack of data management expertise in today's IT workforce.

"There's just not that many people out there who can build a data management workflow," Richardson said.

Richardson said Modus, an e-discovery data collection and storage company that participated in Aparavi's early access program, used the platform to reduce its storage footprint. It exemplified an important use case of the platform -- identifying and eliminating ROT (redundant, outdated, trivial information). He said roughly 30% of all data across all organizations is ROT, and holding on to it leads to longer backup workloads, increased storage costs (including for backup copies of the ROT data) and potential liability if it's held beyond its legal retention period.
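As a rough illustration of what ROT identification involves, the sketch below labels files as redundant, outdated or trivial. It is not Aparavi's method; the `FileRecord` type, the `find_rot` function, the seven-year age cutoff and the list of "trivial" file suffixes are all hypothetical choices made for the example.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileRecord:
    """Hypothetical in-memory stand-in for an indexed file."""
    path: str
    content: bytes
    modified: datetime


def find_rot(records, now, max_age_days=365 * 7, trivial_suffixes=(".tmp", ".bak")):
    """Return {path: label} for files judged redundant, outdated or trivial.

    Assumed rules, for illustration only:
      - redundant: same SHA-256 content hash as an earlier file (by path order)
      - outdated:  last modified before now - max_age_days
      - trivial:   path ends with one of trivial_suffixes
    Files matching none of the rules are left out of the result.
    """
    labels = {}
    seen_hashes = {}
    cutoff = now - timedelta(days=max_age_days)
    for rec in sorted(records, key=lambda r: r.path):
        digest = hashlib.sha256(rec.content).hexdigest()
        if digest in seen_hashes:
            labels[rec.path] = "redundant"
        elif rec.modified < cutoff:
            labels[rec.path] = "outdated"
        elif rec.path.lower().endswith(trivial_suffixes):
            labels[rec.path] = "trivial"
        # Remember the first path seen for each distinct content hash.
        seen_hashes.setdefault(digest, rec.path)
    return labels
```

A real product would apply such rules to an index of metadata rather than to file contents held in memory, and the retention cutoff would come from the applicable legal or regulatory policy rather than a fixed default.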

Two months ago, IDC published a study that found data management to be a top concern among IT organizations. The study, sponsored by backup vendor Rubrik, also found 80% of IT decision-makers identified data sprawl as a major problem and only 9.2% of organizations have a single, centralized data management system or platform.

Chris Wahl, chief technologist at Rubrik, said this isn't a new problem, but it was difficult to get customer buy-in in the past. That changed with COVID-19, which led to a heavily decentralized employee base. He said lack of automation and data sprawl are two problems that feed off each other.

"You can't run IT when you've got everything all over the place," Wahl said.

Wahl said COVID-19 accelerated public cloud adoption for many organizations, with 60% of the respondents saying they achieved tangible benefits from the cloud. In the early days of cloud, customers wanted everything to work the same on cloud as it did on premises, but Wahl said customers are getting smarter about how to design for cost with the cloud. He said there was already a trend of companies getting out of the data center and reaping cost savings on the cloud, and the coronavirus just made it happen sooner.

"Folks were under the gun to transform digitally," Wahl said of COVID-19's impact. "I think it's a great kick in the butt."

Marc Staimer, president of Dragon Slayer Consulting, said the recent uptick in data management news, such as StrongBox updating its StrongLink product to support Linear Tape File System (LTFS) in late June, shouldn't be taken as a sign that the problem has reached some sort of boiling point. For the past four months, every organization's IT efforts were focused on responding to COVID-19 -- building out remote work infrastructure, VPN and Remote Desktop Protocol (RDP), and finding ways to protect all of it. Managing large volumes of decentralized data had simply taken a backseat.

"People have solved their immediate COVID problems, but the old problems never went away. That's why we're seeing demand again," Staimer said.

Staimer said autonomous data management is in demand right now, unlike file-based backup and active archive, which were the functions of Aparavi's File Protect & Insight and Active Archive products. He said Aparavi is a good product on paper, but it will ultimately come down to how automated and how easy it makes data management tasks. He noted Aparavi doesn't have a learning engine, which could be a limiting factor.

Staimer stressed that there is much ambiguity around the term "data management," and that the type of data management Aparavi, Komprise and StrongLink perform is different from what Cohesity, Rubrik and Commvault do. The former group works with the data itself, but the latter works with backup copies of the data, performing what Staimer describes as closer to "copy data management."
