Amazon launches DataZone, a new data management service

The tech giant's new service provides data governance, collaboration and catalog capabilities that enable organizations to find and operationalize data to inform decisions.

AWS launched Amazon DataZone, a new data management service that enables customers to govern, catalog and share data within their organization.

The tech giant first unveiled Amazon DataZone at re:Invent, its annual user conference, in November 2022; moved it into public preview in March 2023; and made the service generally available on Oct. 4.

The intent of Amazon DataZone is to provide a single environment in which data scientists, engineers and other developers as well as data analysts and other data consumers can access and share their organization's data in a governed manner to collectively reach decisions that lead to actions, according to AWS.

Recently, AWS' product development has focused on generative AI, as have the product development plans of most data management and analytics vendors.

In July, the tech giant introduced new Bedrock services that make foundational generative AI models from different vendors available through an API. That same month, AWS also unveiled two new generative AI tools for QuickSight, its main analytics platform.

Amazon DataZone, meanwhile, is a traditional cloud-based data management service designed to help customers govern and operationalize data at scale. AWS is initially making it available to all customers in 11 of its regions, including three U.S. regions and three European regions, and customers can start with a free trial that includes 50 users for three months.

Pricing otherwise begins with a monthly subscription of $9 per user for the first 500 users, $8.10 per user for the next 500 and $7.20 for all users over 1,000. Each monthly subscription – there are no discounts for long-term commitments – includes 20 MB of metadata storage, 4,000 requests and 0.2 compute units.
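To make the tiered pricing concrete, the short Python sketch below estimates a monthly subscription bill from a user count. It assumes the tiers are graduated, meaning each rate applies only to the users who fall within that tier, which is one reading of the figures above. The function and constant names are illustrative, and the sketch ignores the free trial and any charges beyond the included metadata storage, requests and compute units.

```python
from decimal import Decimal

# Per-user monthly rates quoted in the article, as (tier size, price).
# A tier size of None means the tier has no upper bound.
# Assumption: tiers are graduated, so each rate covers only the users in that tier.
TIERS = [
    (500, Decimal("9.00")),   # first 500 users
    (500, Decimal("8.10")),   # next 500 users
    (None, Decimal("7.20")),  # every user over 1,000
]


def monthly_subscription_cost(users: int) -> Decimal:
    """Estimate the monthly subscription cost in dollars for a user count."""
    if users < 0:
        raise ValueError("users must be non-negative")
    remaining = users
    total = Decimal("0")
    for size, price in TIERS:
        if remaining <= 0:
            break
        # Fill the current tier, or take everyone left if the tier is unbounded.
        in_tier = remaining if size is None else min(remaining, size)
        total += in_tier * price
        remaining -= in_tier
    return total


if __name__ == "__main__":
    # 500 * 9.00 + 500 * 8.10 + 200 * 7.20
    print(monthly_subscription_cost(1200))  # prints 9990.00
```

Under this reading, an organization with 1,200 users would pay $9,990 per month before any usage beyond the included allotments.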

New capabilities

Amazon DataZone comes with four primary capabilities aimed at enabling customers to make their growing amounts of data more accessible and usable.

Worldwide, data is growing at an exponential rate. Within individual organizations, data volume and complexity are similarly on the rise as enterprises collect data from more sources. Tools that help organizations more easily manage their data, therefore, are critical.

Amazon DataZone's capabilities include the following:

  • A data portal outside the AWS Management Console in the form of a web application where authenticated users can find, catalog and work with data in a self-service manner.
  • A data catalog in which customers can define data across their organization, making it easy to find data that can be operationalized to train models, populate dashboards and inform decisions.
  • An environment where users can create groupings of people, data assets and analytics tools for collaborative analysis and decision making.
  • Access control and other governance measures that set parameters on who can access certain data as well as which employees own data and can parcel it out when others request it.

One of the primary benefits of Amazon DataZone will be greater efficiency by enabling collaboration and reducing the need to recreate data, according to Stephen Catanzano, an analyst at TechTarget's Enterprise Strategy Group. As a result, the new service is an important addition to the AWS platform.

"It is a significant improvement to empower users to share data resources within DataZone [so they can] be more efficient," he said. "One line-of-business user can create data [and] add it to a catalog, and others can then use it. Cataloging is part of a big movement to support the reuse of data rather than re-creation."

In addition, the service stands to make data workers more efficient by creating an environment in which data is easy to access and quality is heightened by governance measures, Catanzano continued.

"Time and quality from producer to consumer are much more efficient and controlled, which are important," he said. "Everything in data is moving toward real-time or near real-time. If someone sees a sudden demand for something, with DataZone, they can spin up a campaign and get it out fast. That's very valuable. This eliminates manual steps and increases data reuse and collaboration."

Matt Aslett, an analyst at Ventana Research, likewise noted the significance of Amazon DataZone for AWS customers, calling it one of the more important products the tech giant introduced last fall.

Specifically, the service could provide some of the functionality needed to implement a data mesh approach to data management, he said. Data mesh connects an organization's data through a data catalog but decentralizes data management to enable domain experts -- for example, human resources or marketing experts -- to oversee departmental data.

He added that data catalogs, in particular, are gaining importance and called them an "indispensable enabler of good data governance."

A Ventana Research survey showed that three-quarters of organizations with more than 100 data catalog users are confident in their ability to govern and manage data across the business, while just over half of those with 100 or fewer data catalog users are similarly confident.

From preview to production

When first unveiled in preview, Amazon DataZone didn't yet have some of the functionality it now has upon its release to the public, according to AWS.

For example, now the data catalog can be customized with automatic metadata generation based on machine learning to name data assets and columns, a capability not initially available. In addition, among other features, certain governed data sharing capabilities such as subscription approval for access to certain data were added during the preview process.

Looking ahead, Catanzano said AWS should continue making productivity gains a focal point of its roadmap. Just as Amazon DataZone aims to increase collaboration and efficiency, AWS -- and all other data management and analytics vendors -- would be wise to emphasize efficiency through generative AI and other means as it plots its roadmap.

"Like everyone, [AWS should] integrate AI, remove manual processes and increase productivity," Catanzano said. "Similar to [DataZone], but we will see these productivity gains coming everywhere."

Eric Avidon is a senior news writer for TechTarget Editorial and a journalist with more than 25 years of experience. He covers analytics and data management.
