Data Fabric Architecture Delivers Instant Benefits

What is a data fabric?
In the past, organisations have attempted to address data access problems either through point-to-point integration or through the introduction of data hubs. Neither approach is suitable when data is highly distributed and siloed. Point-to-point integration does not scale: every additional endpoint may need its own connection to each existing endpoint, so cost rises steeply as the landscape grows. Data hubs allow for easier integration of applications and sources but increase the cost and complexity of maintaining the quality and trust of data within the hub.
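
The scaling argument can be made concrete with a little arithmetic. The sketch below is a simplification: it assumes every endpoint must link directly to every other endpoint in the point-to-point case, and only to the hub in the hub case. The function names are illustrative and not part of any product.

```python
def point_to_point_links(endpoints: int) -> int:
    """Links needed when every endpoint connects directly to every other one."""
    return endpoints * (endpoints - 1) // 2


def hub_links(endpoints: int) -> int:
    """Links needed when every endpoint connects only to a central hub."""
    return endpoints


# Compare how the two approaches grow as endpoints are added.
for n in (5, 10, 50):
    print(f"{n} endpoints: point-to-point={point_to_point_links(n)}, hub={hub_links(n)}")
```

Under these assumptions, going from 10 to 50 endpoints raises the point-to-point link count from 45 to 1,225, while the hub count rises only from 10 to 50. The hub's hidden cost, as noted above, is the effort of keeping the data inside it trustworthy.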

The data fabric is an emerging architecture that aims to address the data challenges arising out of a hybrid data landscape. Its fundamental idea is to strike a balance between decentralisation and globalisation by acting as the virtual connective tissue between data endpoints.

Through technologies such as automation and augmentation of integration, federated governance as well as activation of metadata, a data fabric architecture enables dynamic and intelligent data orchestration across a distributed landscape, creating a network of instantly available information to power a business.

A data fabric is agnostic to deployment platforms, data processes, geographical locations and architectural approach. It facilitates the use of data as an enterprise asset. A data fabric ensures your various kinds of data can be successfully combined, accessed, and governed both efficiently and effectively.

Capabilities and principles of a data fabric
The core of the data fabric architecture is a data management platform that enables the full breadth of integrated data management capabilities including discovery, governance, curation, and orchestration.

However, a data fabric goes beyond traditional data management concepts such as DataOps, which focuses only on establishing practices, and raises the level of data operationalisation. It is built upon a distributed architecture and advanced technology that can address the needs arising from extreme diversity and distribution of data assets.

A data fabric can be logically divided into four capabilities (or components):

Knowledge, insights and semantics

Provides a data marketplace and shopping experience
Automatically enriches discovered data assets with knowledge and semantics, allowing consumers to find and understand the data

Unified governance and compliance

Allows local management and governance of metadata but supports a global unified view and policy enforcement
Automatically applies policies on data assets in accordance with global and local rules
Utilises advanced capabilities to automate data asset classification and curation
Automatically establishes queryable access routes for any catalogued assets for increased activation of data
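
As a rough illustration of how global and local rules might be combined when a policy is applied to a data asset, consider the sketch below. Everything in it is hypothetical: the `Policy` class, the three actions and, in particular, the "strictest rule wins" resolution are assumptions made for the example, not a description of how any specific data fabric product resolves policies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    classification: str  # label the policy targets, e.g. "PII" (hypothetical)
    action: str          # one of "allow", "mask" or "deny" (hypothetical)


# Assumed ordering: a higher number means a more restrictive action.
STRICTNESS = {"allow": 0, "mask": 1, "deny": 2}


def effective_action(classifications, global_policies, local_policies):
    """Return the strictest action demanded by any matching global or local policy."""
    action = "allow"
    for policy in [*global_policies, *local_policies]:
        matches = policy.classification in classifications
        if matches and STRICTNESS[policy.action] > STRICTNESS[action]:
            action = policy.action
    return action


global_rules = [Policy("PII", "mask")]
local_rules = [Policy("PII", "deny"), Policy("financial", "mask")]

print(effective_action({"PII"}, global_rules, local_rules))        # local rule is stricter
print(effective_action({"financial"}, global_rules, local_rules))  # only a local rule applies
print(effective_action({"public"}, global_rules, local_rules))     # no rule applies
```

The design choice here is that local owners can tighten a global rule but never loosen it, which is one plausible reading of "local management with global enforcement".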

Intelligent integration

Accelerates a data engineer’s tasks through automated flow and pipeline creation across distributed data sources
Enables self-service ingestion and data access over any data with local and global deep enforcement of data protection policies
Automatically determines best-fit execution through optimised workload distribution, self-tuning and correction of schema drift
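
Schema drift, mentioned in the last point, is a change in a source's structure after a pipeline has been built against it. A minimal sketch of detecting it, assuming schemas are represented as plain column-name-to-type dictionaries (a hypothetical representation chosen for the example), could look like this:

```python
def detect_schema_drift(expected, observed):
    """Compare two {column: type} mappings and report how the schema changed."""
    shared = expected.keys() & observed.keys()
    return {
        "added": sorted(observed.keys() - expected.keys()),
        "removed": sorted(expected.keys() - observed.keys()),
        "retyped": sorted(col for col in shared if expected[col] != observed[col]),
    }


# The schema the pipeline was built against versus what the source now delivers.
expected_schema = {"id": "int", "email": "str", "signup_date": "date"}
observed_schema = {"id": "str", "email": "str", "country": "str"}

drift = detect_schema_drift(expected_schema, observed_schema)
print(drift)
```

Detection is only the first half of the capability described above. Correcting the drift, for example by remapping or casting columns, would need rules that depend on the pipeline and are not shown here.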

Orchestration and lifecycle

Enables the composition, testing, operation and monitoring of data pipelines
Infuses AI capabilities in the data lifecycle to automate tasks, self-tune, self-heal and detect source data changes, all of which facilitate automated updates

Business benefits of a data fabric
Data only delivers business value when it is contextualised and becomes accessible by any user or application in the organisation. When implemented correctly, a data fabric helps ensure that value is available throughout the organisation in the most efficient and automated way possible. As such, the fabric has three key benefits:

Enable self-service data consumption and collaboration.
Automate governance, protection and security; enabled by active metadata.
Automate data engineering tasks and augment data integration across hybrid cloud resources.

Enable self-service data consumption and collaboration
By integrating data from multiple sources and analysing a larger fraction of the enormous amount of data generated daily, organisations gain better insights and respond more quickly to changing business demands. A data fabric rapidly delivers data into the hands of those who need it. Self-service enables the organisation as a whole to find appropriate data more quickly and spend more time using that data to provide tangible insights.

Benefits of data fabric for self-service data consumption:

Business users have a single point of access to find, understand, shape and consume data throughout the organisation.
Centralised data governance and lineage help users understand what the data means, where it comes from, and how it is related to other assets.
Extensive and customisable metadata management scales easily and is accessible via APIs.
Self-service access to trusted and governed data enables line-of-business collaboration with other users.

Automate governance, data protection and security; enabled by active metadata
A distributed active governance layer for all data initiatives reduces compliance and regulatory risks by providing trust and transparency. It enables automatic policy enforcement for any data access, providing a high level of data protection and compliance.

Utilising AI and machine learning technologies allows data fabric users to increase their level of automation, for example by automatically extracting data governance rules based on language and definitions in regulatory documents. This allows organisations to apply industry-specific governance rules in a matter of minutes to help avoid costly fines and ensure ethical use of data wherever it resides.

Benefits of a data fabric for governed virtualisation:

Agility, security, and productivity are increased for data engineers, data scientists, and business analysts.
Multiple global data sources appear as one database.
New, industry-leading discovery of personally identifiable information (PII) and critical data elements is possible at massive scale.
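
To give a feel for what automated PII discovery involves, here is a deliberately naive sketch that labels a column by matching its values against regular expressions. The patterns, the `classify_column` function and the 80% threshold are all assumptions made for illustration. Production discovery typically relies on far richer techniques, such as the machine learning classification mentioned elsewhere in this article, and these patterns would misfire on real data (a date written with hyphens can look like a phone number, for instance).

```python
import re

# Hypothetical, intentionally simple patterns; not suitable for production use.
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[\d\s().-]{7,20}$"),
}


def classify_column(values, threshold=0.8):
    """Return the PII labels whose pattern matches at least `threshold` of the non-empty values."""
    non_empty = [v.strip() for v in values if v and v.strip()]
    if not non_empty:
        return []
    labels = []
    for label, pattern in PII_PATTERNS.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= threshold:
            labels.append(label)
    return labels


print(classify_column(["a@example.com", "b@example.org", ""]))
print(classify_column(["+44 20 7946 0958", "555-123-4567"]))
print(classify_column(["hello", "world"]))
```

Once a column carries a label such as "email", governance policies keyed on that label can be applied to it automatically.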

Automate data engineering tasks and augment data integration
Advanced data engineering means that virtually any data access or delivery process is automated and does not require any tedious or error-prone coding. Augmentation of integration utilises metadata to optimise data delivery and access.

Benefits of a data fabric for data engineering and integration:

Automatically optimised data integration helps accelerate data delivery.
Automatic workload balancing and elastic scaling mean jobs are ready for any environment and any data volume.
Resiliency and CI/CD automation are built in.
The automated process for capturing changes in real time supports delivery of quality data for business processes.
Machine learning can automate and extend custom data discovery, classification and curation processes, leading to faster time-to-value.
Continuous analysis can be automatically performed in real time, wherever data lives.

With a data fabric built on IBM Cloud Pak for Data technology, you can hyper-automate data discovery, data governance, and data consumption in a hybrid and multicloud data landscape. Employ a data fabric to enable faster time-to-value for business users, higher productivity for data engineering and operations, and greater governance and compliance fidelity.

