Three factors drive the modernization of data management: the scale, value and risk of information. Organizations now handle data sets that are orders of magnitude larger than in the past. Growing data volumes and a wider variety of sophisticated analytics tools enable more users to access and monetize data, but that also raises the risk of data misuse or exposure.
It's increasingly hard to get a handle on enterprise data management processes without a comprehensive data architecture.
What is data architecture?
DAMA International's Guide to the Data Management Body of Knowledge suggests that data architecture "includes specifications used to describe existing state, define data requirements, guide data integration and control data assets as put forth in a data strategy." In almost any industry, the scope of an enterprise's data architecture blends strategies, tactics, data organization, platforms and tools for managing the end-to-end data lifecycles that inform and drive operational processes and analytical decision-making.
The data architecture primarily guides how an organization manages the enterprise data landscape across the following five domains:
- Infrastructure covers platform choices, such as technical platforms in on-premises data centers, cloud platforms and their associated data management services, data models and network configurations. This domain also covers data storage frameworks and databases, such as relational database management systems, NoSQL database systems, and text and comma-separated values files, as well as semistructured and unstructured data managed in object storage frameworks.
- Management includes data selection (what data sets are created within the organization and which data sets are acquired from outside the enterprise), metadata management capturing information about data object structure and semantics, and data catalogs that provide a searchable inventory of data assets.
- Processing includes data onboarding, ingestion, and configuration and execution of data pipelines that integrate, validate and transform data into the formats used by the business applications.
- Governance combines two things: oversight and stewardship, which manage compliance with the standards, models, rules and policies governing organizational data collection, management and consumption; and data protection techniques, including perimeter security precautions, encryption methods, and role-based and attribute-based access controls.
- Utilization involves identifying the different data consumer communities, assessing their requirements and supporting their usage scenarios through access methods such as direct querying, extracts and data services. It also includes methods of organizing data for reporting and analytics, such as a data warehouse, and end-user visualization tools for data analysis and presentation.
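To make the governance domain's access controls more concrete, the following minimal Python sketch shows one way role-based and attribute-based checks might be combined. The roles, permissions, class names and the department-matching rule are all hypothetical and chosen only for illustration; a real deployment would normally rely on the access control features of its data platform.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping (the role-based part).
ROLE_PERMISSIONS = {
    "analyst": {"read"},
    "steward": {"read", "update"},
}


@dataclass
class User:
    name: str
    role: str
    department: str


@dataclass
class DataSet:
    name: str
    sensitivity: str        # assumed levels: "public" or "restricted"
    owner_department: str


def is_allowed(user: User, data_set: DataSet, action: str) -> bool:
    """Return True only if both the role check and the attribute check pass."""
    # Role-based check: the user's role must grant the requested action.
    if action not in ROLE_PERMISSIONS.get(user.role, set()):
        return False
    # Attribute-based check (illustrative rule): restricted data sets are
    # open only to users in the department that owns them.
    if (data_set.sensitivity == "restricted"
            and user.department != data_set.owner_department):
        return False
    return True


payroll = DataSet("payroll", "restricted", "finance")
finance_analyst = User("Ana", "analyst", "finance")
marketing_analyst = User("Ben", "analyst", "marketing")

print(is_allowed(finance_analyst, payroll, "read"))     # True
print(is_allowed(marketing_analyst, payroll, "read"))   # False
print(is_allowed(finance_analyst, payroll, "update"))   # False
```

The role check answers "what may this kind of user do?", while the attribute check answers "may this particular user do it to this particular data set?" Keeping the two separate makes each policy easier to audit.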
How a solid data architecture can benefit businesses
Not surprisingly, organizations typically make tactical decisions about data architecture. These decisions are often driven by the need to determine a particular application's data requirements or by a set of data extracts rapidly configured to support the creation of a report. Senior leaders become motivated to devise a data architecture once the organization reaches a level of technical maturity at which it recognizes the value of information and how different personas and lines of business can benefit from governed data sharing and management.
At a macro level, a solid data architecture benefits an enterprise in several ways.
Increased data awareness. Siloed data development not only leads to duplicated efforts, but it also effectively prevents the organization from fusing data assets and producing actionable business intelligence. This problem is largely attributable to the lack of data awareness. Individuals who potentially could benefit are not even aware that the data sets they need might already be available. An enterprise with a solid data architecture catalogs its data assets, exposes them to the entire organization and raises data awareness.
Improved accessibility. A holistic view of an enterprise data architecture allows for streamlined processing that simplifies data access, speeds data delivery and project development, and expands data availability.
Cost efficiency. Better organized information helps identify redundant storage and duplicated technology investments and development. Reducing data replication also helps reduce platform demands and sets the stage for assessing where on-premises hardware capital investments can be mitigated by migrating to cloud platforms -- allowing the company to scale its operations expenses according to actual processing demands.
Improved trustworthiness. Compliance with models and standards, together with validation and assurance of integrated data, helps ensure data quality.
Key steps for creating a data architecture plan
Developing a full-scale enterprise data architecture starts with a solid plan. Data architects should follow these nine steps when devising it.
1. Socialize with senior leaders
As with any strategic technology initiative, the value of developing a data architecture must be effectively articulated and continually communicated to C-suite executives. Craft a message that demonstrates the benefits a data architecture brings to the enterprise. Identify and engage key stakeholders to gain their support.
2. Identify the data personas
An organization's technology environment is driven by the information needs of data consumers. Application system custodians are accountable for the data sets their applications produce and use. Determine who creates, stores, updates, reads and otherwise touches data within the enterprise. Identify representative personas and characterize them according to their data touchpoints.
3. Determine information requirements
Engage the data consumers to understand their business strategy and solicit their business requirements for data. Document how those requirements relate to the abstract data domains, such as "customer" or "product" data, and the discrete data sets these consumers currently use or anticipate needing.
4. Evaluate information risks
Identify the applicable data governance directives and interpret how they relate to the handling, management and protection of data.
5. Assess the data landscape
Survey and document the name, location, owner, producer, consumers and contents of enterprise data sets. Classify each data set according to usage scenarios and sensitivity and collect this information in a data catalog.
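As a rough illustration of what such a survey might record, the Python sketch below models a catalog entry with the fields named in this step and a simple in-memory catalog that can be searched by keyword or filtered by sensitivity. The class names, field names, sensitivity labels and sample entries are hypothetical; a production data catalog would usually be a dedicated tool rather than hand-written code.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CatalogEntry:
    """One surveyed data set, using the attributes listed in this step."""
    name: str
    location: str
    owner: str
    producer: str
    consumers: List[str]
    contents: str
    usage_scenarios: List[str]
    sensitivity: str  # assumed labels, e.g. "public", "internal", "restricted"


class DataCatalog:
    """A minimal in-memory inventory of data assets."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.name] = entry

    def search(self, keyword: str) -> List[CatalogEntry]:
        """Case-insensitive match against the name and contents fields."""
        kw = keyword.lower()
        return [e for e in self._entries.values()
                if kw in e.name.lower() or kw in e.contents.lower()]

    def by_sensitivity(self, level: str) -> List[CatalogEntry]:
        return [e for e in self._entries.values() if e.sensitivity == level]


catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="customer_master", location="warehouse.crm", owner="Sales Ops",
    producer="CRM application", consumers=["Marketing", "Finance"],
    contents="Customer names, contacts and account status",
    usage_scenarios=["reporting", "segmentation"], sensitivity="restricted"))
catalog.register(CatalogEntry(
    name="product_list", location="warehouse.catalog", owner="Merchandising",
    producer="Product information system", consumers=["E-commerce"],
    contents="Product identifiers, descriptions and prices",
    usage_scenarios=["reporting"], sensitivity="internal"))

print([e.name for e in catalog.search("customer")])           # ['customer_master']
print([e.name for e in catalog.by_sensitivity("internal")])   # ['product_list']
```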
6. Analyze the data lifecycles
Evaluate how data sets flow from their origination points to their final destinations. Document the lineage of the data as it moves through the data pipelines.
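One simple way to capture lineage is as a mapping from each data set to the data sets that feed it directly. The Python sketch below, built on hypothetical data set names, walks that mapping to find everything upstream of a given data set and the origination points among them. It assumes lineage is recorded at the data set level; column-level lineage would need a richer model.

```python
from typing import Dict, List, Set

# Hypothetical lineage: each key is a data set, each value lists the
# data sets that feed it directly.
LINEAGE: Dict[str, List[str]] = {
    "sales_report": ["sales_mart"],
    "sales_mart": ["orders_clean", "customers_clean"],
    "orders_clean": ["orders_raw"],
    "customers_clean": ["crm_export"],
}


def upstream(data_set: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Return every data set that feeds data_set, directly or indirectly."""
    seen: Set[str] = set()
    stack = [data_set]
    while stack:
        current = stack.pop()
        for parent in lineage.get(current, []):
            if parent not in seen:   # the seen set also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def origins(data_set: str, lineage: Dict[str, List[str]]) -> Set[str]:
    """Return the upstream data sets that have no recorded sources."""
    return {d for d in upstream(data_set, lineage) if d not in lineage}


print(sorted(upstream("sales_mart", LINEAGE)))
# ['crm_export', 'customers_clean', 'orders_clean', 'orders_raw']
print(sorted(origins("sales_report", LINEAGE)))
# ['crm_export', 'orders_raw']
```

Reversing the same mapping would answer the opposite question: which downstream reports are affected when a source data set changes.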
7. Appraise the data infrastructure
Document the current state of data management in the enterprise and capture the current technology infrastructure -- what systems, database structures, data warehouses, data marts and operational data stores are used, whether they're on premises or in the cloud and, if the latter, the cloud service providers.
8. Do a SWOT analysis
Synthesize the knowledge that has been collected and analyze the strengths, weaknesses, opportunities and threats. Identify the greatest opportunities for improvement.
9. Create a blueprint and roadmap
Devise a blueprint for framing the enterprise data architecture that summarizes the collected knowledge and highlights proposed deployment projects. Scope out a roadmap for the proposed projects across the near-, medium- and longer-term horizons.