Enterprise data marketplace aims to ease self-service chaos

Self-service data preparation can duplicate work and slow down analytics. One possible fix: an internal marketplace where users can 'shop' for data assets.

GRANTS PASS, Ore. -- The distributed data environments mushrooming in many organizations expand the information that's available for BI and analytics applications. But they can also lead to incompatible data silos, overspending on data management tools and duplication of self-service analytics efforts -- problems that the concept of an enterprise data marketplace is designed to address.

An internal offshoot of online data marketplaces where different companies can buy and sell data, the enterprise data marketplace provides a common repository of curated data sets, reports, dashboards and other ready-to-use data assets for business analysts, data scientists and other end users throughout an organization.

"It's kind of a fancy name for a catalog that gives you access to trusted data for analytics projects," said Mike Ferguson, managing director of U.K.-based consulting firm Intelligent Business Strategies Ltd.

Less data preparation, faster analytics

The goal is to minimize the amount of data preparation and querying work that analytics users need to do themselves, Ferguson added during a roundtable discussion he led at the 2019 Pacific Northwest BI & Analytics Summit. Doing so speeds up the analytics process, makes it easier to find relevant information and reduces the risk that different users will end up working with inconsistent data, he said.

Creating an enterprise data marketplace also goes hand in hand with deploying two other things, according to Ferguson: a logical data lake architecture that knits together disparate data stores, and a unified set of tools for ingesting, preparing and governing data. Together, those initiatives can rein in data and technology sprawl, he said. For example, one of his U.K. clients discovered it was running 27 Hadoop clusters in separate silos; another was using 21 different extract, transform and load tools for data integration.

As described by Ferguson, an enterprise data marketplace combines data catalog software with search and other navigation features, collaboration tools and content publishing capabilities. Data lineage and metadata management functions also need to be incorporated, along with policies for data classification and data governance, he said.
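To make those components concrete, here is a minimal Python sketch of a catalog with search and lineage lookup. It is an illustration only, not a description of any product Ferguson discussed: the `DataAsset` and `Marketplace` classes, their field names and the sample assets are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """One catalog entry. All field names are hypothetical."""
    name: str
    kind: str              # e.g. "dataset", "report", "dashboard"
    description: str
    classification: str    # e.g. "public", "internal", "confidential"
    tags: set[str] = field(default_factory=set)
    lineage: list[str] = field(default_factory=list)  # direct upstream assets


class Marketplace:
    """In-memory stand-in for a data catalog with search and lineage."""

    def __init__(self) -> None:
        self._assets: dict[str, DataAsset] = {}

    def publish(self, asset: DataAsset) -> None:
        self._assets[asset.name] = asset

    def search(self, term: str) -> list[DataAsset]:
        """Substring match on name and description, exact match on tags."""
        term = term.lower()
        hits = [
            a for a in self._assets.values()
            if term in a.name.lower()
            or term in a.description.lower()
            or term in {t.lower() for t in a.tags}
        ]
        return sorted(hits, key=lambda a: a.name)

    def upstream(self, name: str) -> list[str]:
        """Return every ancestor of an asset by walking its lineage."""
        seen: list[str] = []
        stack = list(self._assets[name].lineage)
        while stack:
            parent = stack.pop()
            if parent in seen:
                continue
            seen.append(parent)
            if parent in self._assets:
                stack.extend(self._assets[parent].lineage)
        return seen


market = Marketplace()
market.publish(DataAsset("raw_orders", "dataset", "Orders as ingested", "internal"))
market.publish(DataAsset("clean_orders", "dataset", "Deduplicated orders",
                         "internal", {"sales"}, ["raw_orders"]))
market.publish(DataAsset("sales_dashboard", "dashboard", "Weekly sales overview",
                         "internal", {"sales"}, ["clean_orders"]))

print([a.name for a in market.search("sales")])
print(market.upstream("sales_dashboard"))
```

In this sketch, a user who "shops" for sales data finds both the curated data set and the dashboard built on it, and can trace either one back to its raw source.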

[Figure: Enterprise data marketplace components]

In an interview after the session, Ferguson said the marketplace concept isn't mainstream yet, but he's seeing some large organizations adopt it. Senior executives there "are realizing that they're going to be blowing a lot more money if they keep doing the same thing," he said.

Data marketplace vs. self-service freedom

But resistance from self-service users who want to continue to gather, prepare and model data on their own could be a roadblock to adoption, said other participants at the summit, which brings together a group of consultants and vendor executives to discuss BI, analytics and data management trends.

"How often," asked Harriet Fryman, a former analytics and customer experience exec at IBM who now is its vice president of analyst relations, "do we look at an Excel spreadsheet and say, 'That's exactly what I want'? Why should we expect people to just take a standard data model and go with it?"

Josh Good, senior director of product marketing at BI and data management vendor Qlik, agreed that an enterprise data marketplace could be a hard sell in companies if it isn't seen as something that makes analytics faster and easier. "We have to make the ready-made approach more convenient than the self-service one, and right now, self-service is more convenient in a lot of cases," he said.

After the session, Good said data and analytics silos are big problems -- and they're only likely to get worse in organizations that are deploying a variety of data processing platforms and self-service tools for different applications in individual departments and business units.

"You need a central place for people to go [to find analytics data]," Good said. "But the key thing is how do you get that behavior? People want [the data] the way they want it, and you have to serve it to them in a frictionless way."

Not a straitjacket for analytics users

Ferguson said the idea isn't to force data scientists and other analysts to use data that isn't exactly what they need. However, what's available in an enterprise data marketplace can provide a good starting point that increases both analytical and business agility, he added -- even if it only eliminates, say, 70% of the self-service data preparation and analytics process, leaving users to "take it the final mile" in applications. "I'm not suggesting that you put everyone in a straitjacket," Ferguson said. "But if you want to be agile, giving everyone self-service tools and telling them to figure it out isn't the way to do it."

Asked afterward if letting users customize reports, dashboards and other assets runs the risk of creating the same kind of data inconsistencies that the enterprise data marketplace is designed to stamp out, Ferguson said that comes down to implementing effective policies to govern the process. "As long as there's some governance of publishing on this, then it can stay organized."

In addition, some new roles may need to be created on data management teams to make a marketplace work, Ferguson said. For example, he pointed to the need for a data catalog administrator and a taxonomy designer, as well as people designated as producers, approvers, publishers and stewards of the data and analytics assets.
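One way to picture "governance of publishing" is as a small state machine in which each role can move an asset only one step forward. The Python sketch below is an assumption-laden illustration, not a process Ferguson described: it models just three of the roles he named, and the states, their order and the `advance` function are hypothetical.

```python
from enum import Enum


class Role(Enum):
    PRODUCER = "producer"
    APPROVER = "approver"
    PUBLISHER = "publisher"


class State(Enum):
    DRAFT = "draft"
    SUBMITTED = "submitted"
    APPROVED = "approved"
    PUBLISHED = "published"


# Hypothetical rule table: (current state, acting role) -> next state.
TRANSITIONS = {
    (State.DRAFT, Role.PRODUCER): State.SUBMITTED,
    (State.SUBMITTED, Role.APPROVER): State.APPROVED,
    (State.APPROVED, Role.PUBLISHER): State.PUBLISHED,
}


def advance(state: State, role: Role) -> State:
    """Move an asset one step forward, or refuse if the role isn't allowed."""
    try:
        return TRANSITIONS[(state, role)]
    except KeyError:
        raise PermissionError(
            f"{role.value} cannot act on an asset in state {state.value}"
        ) from None


state = State.DRAFT
for role in (Role.PRODUCER, Role.APPROVER, Role.PUBLISHER):
    state = advance(state, role)
print(state.value)
```

Under these assumed rules, a customized report cannot reach the marketplace until a producer has submitted it and an approver has signed off, which is the kind of check that keeps user-modified assets from reintroducing inconsistencies.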
