In this excerpt from Master Data Management and Data Governance, readers will learn how master data management (MDM) and SOA – or service-oriented architecture – relate. Readers will also find an introduction to SOA and learn about the benefits of SOA.
Table of Contents
- The evolution of MDM architecture
- An introduction to enterprise architecture framework and MDM patterns
- MDM and SOA: An introduction to SOA and the benefits of SOA
- MDM design, MDM deployment options and MDM hierarchy
MDM Architecture Viewpoints
Because of its broad coverage of the business-to-technology dimensions, an architecture framework can help organize and promote different points of view for an enterprise. Different groups within the organization may express these points of view based on their organizational affiliation, skill sets, and even the political landscape of the workplace. Because a full-function MDM solution tends to be truly an enterprise-scale initiative that spans organizational and lines-of-business boundaries, one benefit of using the framework approach is to help gain organizational buy-in and support for expensive and lengthy MDM projects.
Of course, we do not want to create an impression that any MDM solution has to be architected using Zachman’s framework. In fact, very few enterprise-wide initiatives use this framework in its entirety with all its 30 viewpoints. Many architecture-savvy organizations use a subset of the complete enterprise architecture framework or different architecture viewpoints. The goal of the preceding discussion was simply to illustrate the principles and benefits of the enterprise architecture framework and patterns approach as a way to solve the design and implementation challenges of any large and complex software system.
We would like to use the principles of the architecture framework to define the most relevant architecture viewpoints for a successful design and implementation of an MDM solution, with a specific emphasis on the MDM Data Hub implementations. In this context, we will focus the framework viewpoints discussion on the conceptual and logical levels of the architecture, and shall consider the following set of architecture viewpoints:
- Architecture viewpoints for various classification dimensions, in particular the consumption and reconciliation dimension and the use pattern dimension
- Conceptual architecture
- High-level reference architecture
- Services architecture
- Data architecture
From the framework perspective, we recognize many different but equally important architecture viewpoints. However, because describing a complete framework set is beyond the scope of this book, we’ll focus the follow-on discussion in this chapter on three viewpoints: the services view, architecture views of MDM classification dimensions (we introduced this topic in Chapter 1), and the reference architecture view. We discuss additional architecture details and specific data architecture views in Chapters 5, 6, and 7, whereas data security and visibility architecture views are discussed in Chapter 11.
Services Architecture View
A services architecture viewpoint is probably one of the most relevant to the architecture discussion of the MDM system. Indeed, we have stated repeatedly that an MDM system should be an instance of the service-oriented architecture (SOA). Using this viewpoint has an additional benefit in that it helps us illustrate how we can extend the very approach of the enterprise architecture framework to describe complex systems such as MDM systems. Indeed, even though Zachman’s framework does not explicitly show a services architecture viewpoint, we will define such a viewpoint for a Data Hub system and show how this viewpoint can be mapped to Zachman’s framework.
Introduction to Service-Oriented Architecture
We define service-oriented architecture (SOA) as an architecture in which software components can be exposed as loosely-coupled, coarse-grained, reusable services that can be integrated with each other and invoked by different applications for different purposes through a variety of platform-independent service interfaces available via standard network protocols.
This is a practical definition but not the only valid definition of SOA. There are a number of alternative definitions of SOA, and it’s beyond the scope of this book to describe them all or offer arguments about the merits of individual definitions. Therefore, we should consider a standard bearer in the SOA space. The World Wide Web Consortium (W3C) has developed a comprehensive definition of the service-oriented architecture in its February 2004 Working Group publication.
W3C Definition of Service-Oriented Architecture
A service-oriented architecture (SOA) is a form of distributed systems architecture that is typically characterized by the following properties:
- Logical view. The service is an abstracted, logical view of actual programs, databases, business processes, and so on, defined in terms of what it does, typically carrying out a business-level operation.
- Message orientation. The service is formally defined in terms of the messages exchanged between provider agents and requester agents, and not the properties of the agents themselves.
- Description orientation. A service is described by machine-processable metadata.
- Granularity. Services tend to use a small number of operations with relatively large and complex messages.
- Network orientation. Services tend to be oriented toward use over a network, although this is not an absolute requirement.
- Platform-neutral. Messages are sent in a platform-neutral, standardized format delivered through the interfaces.
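As a minimal sketch of how several of these properties might look in practice, the Python fragment below defines a service purely in terms of the messages it exchanges, describes it with machine-processable metadata, and uses JSON as a platform-neutral format. The service name, message types, fields, and in-memory store are all illustrative assumptions and are not part of the W3C definition.

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical message types; names and fields are assumptions for illustration.
@dataclass
class GetCustomerRequest:
    customer_id: str

@dataclass
class GetCustomerResponse:
    customer_id: str
    name: str
    addresses: list

# Description orientation: machine-processable metadata about the service.
SERVICE_DESCRIPTION = {
    "service": "CustomerService",
    "operations": {
        "GetCustomer": {"input": "GetCustomerRequest", "output": "GetCustomerResponse"}
    },
}

# Stand-in for the programs or databases hidden behind the logical view.
_BACKING_STORE = {"C-1": {"name": "Ada Lovelace", "addresses": ["12 Example St"]}}

def handle_message(raw: str) -> str:
    """Accept a platform-neutral JSON message and return a JSON reply."""
    envelope = json.loads(raw)
    if envelope["operation"] != "GetCustomer":
        return json.dumps({"error": "unknown operation"})
    request = GetCustomerRequest(**envelope["body"])
    record = _BACKING_STORE[request.customer_id]
    response = GetCustomerResponse(request.customer_id, record["name"], record["addresses"])
    return json.dumps({"operation": "GetCustomerResponse", "body": asdict(response)})

# One coarse-grained operation returns the whole customer in a single message.
reply = json.loads(handle_message(json.dumps(
    {"operation": "GetCustomer", "body": {"customer_id": "C-1"}})))
```

The requester needs to know only the message shapes and the description, not how the provider stores or retrieves the data.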
Similar to the architecture framework discussion, we can define SOA in a way that recognizes multiple views of service orientation and clearly relies on the messaging paradigm implemented over a network. Moreover, because services are composed from service components, and can be organized to work together to perform a given task, we need to introduce two additional concepts: service orchestration and service choreography. These concepts are key for the notion of service management. There are numerous, often conflicting definitions of these terms. We offer here one definition set as a reference. Readers interested in this subject can review other definitions available on the Web.
SOA and Service Management: Orchestration and Choreography
Orchestration refers to the automated execution of a workflow. An orchestrated workflow is typically exposed as a set of services that can be invoked through an API. It does not describe a coordinated set of interactions between two or more parties. Choreography refers to a description of coordinated interactions between two or more parties.
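The distinction can be illustrated with a small Python sketch. In the first half, a single coordinator executes the workflow steps in order (orchestration). In the second half, there is no coordinator, and the parties simply react to events according to an agreed sequence (choreography). The step functions, event names, and the `EventBus` class are hypothetical and serve only to contrast the two styles.

```python
from collections import defaultdict

# Two illustrative workflow steps shared by both styles.
def validate(record):
    return {**record, "valid": bool(record.get("name"))}

def match(record):
    return {**record, "match_id": "M-" + record["name"].lower()}

# Orchestration: one coordinator decides the order and runs each step.
def orchestrate(record):
    log = []
    for step in (validate, match):
        record = step(record)
        log.append(step.__name__)
    return record, log

# Choreography: no coordinator; each party reacts to agreed events.
class EventBus:
    def __init__(self):
        self.subscribers = defaultdict(list)
        self.trace = []  # order in which events were published

    def subscribe(self, event, handler):
        self.subscribers[event].append(handler)

    def publish(self, event, payload):
        self.trace.append(event)
        for handler in self.subscribers[event]:
            handler(payload)

bus = EventBus()
# The shared contract: created -> validated -> matched.
bus.subscribe("record.created", lambda r: bus.publish("record.validated", validate(r)))
bus.subscribe("record.validated", lambda r: bus.publish("record.matched", match(r)))

result, steps = orchestrate({"name": "Acme"})
bus.publish("record.created", {"name": "Acme"})
```

In the orchestrated version the sequence lives in one place (`orchestrate`). In the choreographed version the sequence emerges from the subscriptions that the parties have agreed upon.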
The definition of SOA and its key concepts help define a services view of the MDM system in a way that makes it clear which services, functions, and components need to be considered and included for a full-function MDM SOA implementation. We discuss this point in more detail later in this chapter.
In addition to the regular SOA viewpoint, we can also show that the service-oriented architecture can be mapped to the viewpoints of an enterprise architecture framework. Specifically, consider that SOA is not a specific technology or product. Rather, it can be described as a design philosophy for the application architecture portion of the framework.
If we use the SOA definition to represent information technology assets as services, then SOA can be mapped to the framework at the Logical level within the Function domain.
We can logically extend this approach to show that the set of functional services represents business processes, and because SOA is based on the network-aware messaging paradigm, the notion of the service orientation can be realized in several architecture framework viewpoints that connect process models and network-based messaging.
We offer these considerations simply to demonstrate that the framework approach and service-oriented architecture are closely connected and continuously evolving concepts that together can be used to help describe and plan the design and implementation of complex systems such as Master Data Management.
Additional insights into SOA include the following key benefits:
- SOA offers access mechanisms to the application logic as a service to users and other applications where:
- Service interfaces are independent of user interfaces.
- Services are business-process-oriented.
- Business-level services are coarse-grained and can be easily mapped to business functions.
- Coarse-grained services can be combined or assembled from lower-level, fine-grained service primitives at run time.
- Services are published in a standard fashion for discovery and execution.
- Services can be used and reused by existing applications and systems.
- SOA permits the construction of scalable applications over the network.
- SOA supports asynchronous communications.
- SOA supports application-level conversations as well as process and state management.
SOA can significantly simplify and accelerate the development of new applications by invoking a variety of published services and organizing or orchestrating them to achieve the desired business functionality. Because SOA allows business-level services to be assembled at run time, developers do not have to design all possible variations of services in advance. This reduces the development time and helps minimize the number of errors in the application code.
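The idea of assembling a business-level service at run time from published, fine-grained primitives can be sketched as follows. This is an illustration under stated assumptions rather than a prescribed design: the registry, the `publish` decorator, and the three primitive services are hypothetical names invented for the example.

```python
from typing import Callable, Dict, List

# Registry of published fine-grained services (all names are hypothetical).
REGISTRY: Dict[str, Callable[[dict], dict]] = {}

def publish(name: str):
    """Decorator that publishes a service under a discoverable name."""
    def register(func):
        REGISTRY[name] = func
        return func
    return register

@publish("standardize_name")
def standardize_name(data: dict) -> dict:
    return {**data, "name": data["name"].strip().title()}

@publish("standardize_country")
def standardize_country(data: dict) -> dict:
    return {**data, "country": data["country"].strip().upper()}

@publish("assign_key")
def assign_key(data: dict) -> dict:
    return {**data, "key": f'{data["country"]}:{data["name"]}'}

def compose(step_names: List[str]) -> Callable[[dict], dict]:
    """Assemble a coarse-grained service from registry entries at run time."""
    steps = [REGISTRY[name] for name in step_names]  # discovery by name
    def composite(data: dict) -> dict:
        for step in steps:
            data = step(data)
        return data
    return composite

# The business-level service is defined by a list of names, not by new code.
onboard_customer = compose(["standardize_name", "standardize_country", "assign_key"])
customer = onboard_customer({"name": "  acme corp ", "country": " us "})
```

Because the composite is built from a list of service names, a different business function can be produced by changing the list rather than by writing a new variant of each service in advance.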
One of the benefits of SOA is its ability to leverage the power and flexibility of Web Services across the enterprise by building loosely-coupled, standards-based applications that produce and consume services.
Introduction to Web Services
Web Services is another important concept that enables a shift in distributed computing toward loosely-coupled, standards-based, service-oriented architectures that help achieve better cross-business integration, improved efficiency, and closer customer relationships.
The short definition of Web Services offered here states that Web Services are encapsulated, loosely-coupled, contracted software objects that are published and consumed using standard interfaces and protocols.
The true power of Web Services lies in three related concepts that describe how Web Services change the fundamental nature of distributed computing:
- Web Services offer a standard way of supporting both synchronous and asynchronous messages – a capability essential to perform long-running B2B transactions.
- Web Services are loosely coupled, enabling a reduction in the integration costs as well as facilitating a federation of systems.
- Web Services support coarse granularity of the application programming interfaces (APIs). A coarse-grained interface rolls up the functions of many different API calls into a small number of business-oriented messages – a key to business process management and automation.
A good discussion on Web Services, SOA, and Web Services Architecture (WSA) can be found in the W3C Architecture documents. For simplicity, we’ll define Web Services as encapsulated, loosely-coupled, contracted software objects that are published and consumed using standard interfaces and protocols.
A high-level view of a service-oriented architecture is shown in Figure 4-3.
Another, more structured view of the service-oriented reference architecture has been developed by a standards organization called the Organization for the Advancement of Structured Information Standards (OASIS). One of the OASIS SOA reference architecture views is depicted in Figure 4-4.
SOA and Web Services are rapidly evolving from intra-enterprise usage to inter-enterprise communities of interest to general-purpose business-to-business environments, thus enabling significant reductions in the cost of integration among established business partners. SOA and Web Services have changed the way companies do business. For example, business transactions that use Web Services can offer new per-use or subscription-based revenue opportunities by exposing value-added services via public, Internet-accessible directories.
Combined with the benefits of the entity-centric transformations offered by the MDM Data Hub solutions, Web Services and SOA are powerful tools that can have a direct and positive impact on the design, implementation, and benefits of new entity-centric business strategies.
MDM and SOA
MDM is a direct beneficiary and at the same time an enabler of the service-oriented approach and Web Services. Indeed, MDM’s complexity and variability of features and options all benefit from the ability to “assemble” or compose an MDM system from a palette of available services by leveraging service reusability, service granularity, and loose coupling. SOA and Web Services by their very nature promote standards compliance as well as service provisioning and monitoring.
Moreover, SOA requires and enables service identification and categorization – features that represent a natural affinity to the capabilities of the MDM platform. Services categorization by itself is a valuable concept, because it helps define and understand service taxonomy, which in turn can guide architects and designers to the most effective placement and composition of services and their interdependencies. We show how these capabilities can be mapped onto the MDM system’s services view later in this chapter.
In other words, there are significant synergies between MDM and SOA. At a high level, these synergies can be summarized as follows:
- SOA defines a fabric that helps deliver operational and analytical master data from the MDM system to all business application systems and users.
- MDM is a core engine of SOA master data services (MDS) and uses SOA components and principles to make master data available to its applications and users via services.
These synergies between MDM and SOA are not automatic. It is important to understand that SOA programs aimed at Master Data Management sometimes fail because the enterprise group responsible for the services framework does not align the SOA strategy, framework, and components with the enterprise data strategy and specifically the MDM strategy. MDM, with its cross-functional context, is a perfect area of application for SOA. When an SOA does not support MDM data services, the value of the SOA, even if it is implemented well from the technology perspective, is marginal.
Applying SOA principles to MDM solutions, we can construct a high-level service-oriented view of the MDM Data Hub (see Figure 4-5). Here, the Data Hub acts as a services platform that supports two major groups of services: internal, infrastructure-type services that maintain Data Hub data integrity and enable necessary functionality; and external, published services. The latter category of services maps well to the business functions that can leverage the MDM Data Hub. These services are often considered business services, and the Data Hub exposes these external business services for consumption by the users and applications.
As we stated in the section on defining MDM architectural philosophy, we can organize all the services into a layered framework, with the services consumers on the top requesting and using the coarse-grained business services on the second layer. These published, business-level services invoke appropriate internal, fine-grained services in the layer(s) below. In this context, Data Hub internal services enable data access and maintain data integrity, consistency, security, and availability. The internal services interact with the Data Hub as a data service provider, and potentially with other data stores for the purpose of data acquisition, synchronization, and delivery.
The services invoke executable components and implement methods that perform requested actions. Following the principles of the service-oriented architecture and Web Services, the lower-level Data Hub services can be combined to form more coarse-grained, composite, business-level services that execute business transactions. In general, the service-oriented nature of the Data Hub platform would allow this service assembly to take place at run time. In this case, a Data Hub would help establish an appropriate execution environment, including the support for transactional semantics, orchestration/choreography, composition and remediation of failed actions, and security services. A full-function MDM Data Hub system would deliver these features through a dedicated set of internal services and functional components.
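A minimal Python sketch of this layering is shown below. A consumer calls one coarse-grained business service, which in turn invokes fine-grained internal services for authorization, data access, and auditing, and restores the prior state if an action fails. The class names, method names, and the authorization rule are assumptions made for illustration. They do not describe any particular vendor's Data Hub.

```python
class DataHubInternalServices:
    """Fine-grained internal services (hypothetical names and rules)."""
    def __init__(self):
        self._store = {"C-1": {"name": "Acme Corp", "city": "Boston"}}
        self.audit_log = []

    def authorize(self, user: str, action: str) -> bool:
        # Illustrative rule: anyone may read, only a steward may write.
        return user == "steward" or action == "read"

    def read(self, key: str) -> dict:
        return dict(self._store[key])

    def write(self, key: str, record: dict) -> None:
        self._store[key] = dict(record)

    def audit(self, user: str, action: str, key: str) -> None:
        self.audit_log.append((user, action, key))


class CustomerBusinessService:
    """Coarse-grained, published service composed from internal services."""
    def __init__(self, internal: DataHubInternalServices):
        self._internal = internal

    def change_city(self, user: str, key: str, city: str) -> dict:
        if not self._internal.authorize(user, "write"):
            raise PermissionError(f"{user} may not update {key}")
        before = self._internal.read(key)
        try:
            self._internal.write(key, {**before, "city": city})
            self._internal.audit(user, "change_city", key)
        except Exception:
            # Remediation of a failed action: restore the prior state.
            self._internal.write(key, before)
            raise
        return self._internal.read(key)

# The service consumer sees only the business-level operation.
hub = DataHubInternalServices()
service = CustomerBusinessService(hub)
updated = service.change_city("steward", "C-1", "Chicago")
```

The consumer never touches the store directly, so integrity, security, and audit concerns remain inside the Data Hub's internal layer.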
An important benefit of the SOA approach is that it allows for significant flexibility in the way the MDM services are selected and delivered, and the SOA approach does not require or imply that all services have to be a part of a single product. In fact, SOA allows MDM designers to select best-in-class solutions for the services. For example, an organization may select an MDM vendor product implemented as a service platform but may decide to use an entity matching engine and data integration services from another vendor based on features, price points, existing inventory, familiarity with the product, and a host of other reasons.
Although many MDM vendor products support service-oriented functionality, support for the scalability, flexibility, and granularity of the business services varies significantly from product to product.
We discuss additional details of the Data Hub as an instance of a service-oriented architecture in Chapter 6. We describe the Data Hub components view in the context of the reference architecture later in this chapter.
MDM and SOA Misconceptions
One of the key differences between modern services-oriented MDM solutions and their ODS predecessors is that an MDM Data Hub is much more a service platform than just a data repository. The term “MDM Data Hub” is often inaccurately used to mean the same thing as the more traditional operational data stores of the 1980s and 1990s. Misusing the term adversely affects understanding of the modern design options of Enterprise Data Management (EDM) and MDM solutions that are enabled by MDM Data Hub systems.
There are some key characteristics and features of Data Hubs that often are underestimated or misunderstood by enterprise architects and systems integrators. Here are two of the most common misconceptions:
• Misconception 1: An MDM Data Hub is just another data repository or a database used for storage of cleansed data content, often used to build data warehousing dimensions.
Indeed, data must be cleansed and standardized before it is loaded into the Data Hub. For many professionals brought up on the concepts of operational data stores, data warehouses, and ETL (Extract, Transform, and Load), this is an undisputable truth. But it’s not the only concern of Data Hub data content. Modern MDM architectures support a much more active approach to data than just the storage of a golden record. The Data Hub makes the best decisions on entity and relationship resolution by arbitrating the content of data created or modified in the source systems. Expressed differently, a Data Hub operates as a master data service responsible for the creation and maintenance of master entities and relationships.
The concept of a Data Hub as the enterprise master data service (MDS) applies the power of advanced algorithms and human input to resolve entities and relationships in real time. In addition, data governance policies and data stewardship rules and procedures define and mandate the behavior of the master data service, including the release of reference codes and code translation semantics for enterprise use.
The services nature of the MDM Data Hub provides an ideal way for managing data within an SOA environment. Using a hub-and-spoke model, the MDS serves as the integration method to communicate between all systems that produce or consume master data. The MDS is the hub, and all systems communicate directly with it using SOA principles.
Participating systems are operating in a loosely-coupled, federated fashion and are “autonomous” in SOA parlance, meaning that they can stay independent of one another and do not have to know the details of how other systems manage master data. This allows disparate system-specific schemas and internal business rules to be hidden, which greatly reduces tight coupling and the overall brittleness of the MDM ecosystem. It also helps to reduce the overall workload that participating systems must bear to manage master data.
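The hub-and-spoke arrangement described above can be sketched in Python as follows. Each participating system registers a pair of mapping functions with the master data service, so its local schema stays hidden from every other system. The `MasterDataService` class, the two systems, and their field names are hypothetical examples, not a reference design.

```python
from typing import Callable, Dict

class MasterDataService:
    """Hub: the only component the spokes talk to (illustrative sketch)."""
    def __init__(self):
        self._master: Dict[str, dict] = {}
        self._to_canonical: Dict[str, Callable[[dict], dict]] = {}
        self._from_canonical: Dict[str, Callable[[dict], dict]] = {}

    def register_system(self, system, to_canonical, from_canonical):
        self._to_canonical[system] = to_canonical
        self._from_canonical[system] = from_canonical

    def publish(self, system: str, local_record: dict) -> None:
        """A spoke sends a record in its own schema; the hub stores it canonically."""
        canonical = self._to_canonical[system](local_record)
        self._master.setdefault(canonical["id"], {}).update(canonical)

    def fetch(self, system: str, master_id: str) -> dict:
        """A spoke receives master data translated into its own schema."""
        return self._from_canonical[system](self._master[master_id])

mds = MasterDataService()
# Hypothetical CRM schema: cust_no / full_name.
mds.register_system(
    "crm",
    lambda r: {"id": r["cust_no"], "name": r["full_name"]},
    lambda c: {"cust_no": c["id"], "full_name": c["name"]},
)
# Hypothetical billing schema: account / holder.
mds.register_system(
    "billing",
    lambda r: {"id": r["account"], "name": r["holder"]},
    lambda c: {"account": c["id"], "holder": c["name"]},
)

mds.publish("crm", {"cust_no": "C-1", "full_name": "Acme Corp"})
billing_view = mds.fetch("billing", "C-1")
```

The billing system obtains the customer without knowing that the CRM system exists or how it names its fields, which is the loose coupling the hub-and-spoke model aims for.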
• Misconception 2: The system of record must be persisted in the MDM Data Hub.
The notion of a Data Hub as a data repository erroneously presumes that the single version of the truth, the golden record, must be persisted in the Data Hub. The notion of the MDM Data Hub as a service platform does not make this presumption. Indeed, as long as the master data service can deliver master data to the enterprise, the Data Hub may persist the golden record or assemble it dynamically instead.
One of the arguments for persistently stored master data in the Data Hub is that performance for master data retrieval will suffer if the record is assembled dynamically on request. The reality is that the existing Data Hub solutions have demonstrated that dynamic master data content can be assembled with practically no performance impact if the master data model is properly implemented.
One of the advantages of dynamically assembled records is that the Data Hub can maintain multiple views of the master data aligned with line-of-business and functional requirements, data visibility requirements, tolerance to false positives and negatives, and latency requirements. Mature enterprises increasingly require multiple views for the golden record, and the dynamic record assembly works better to support this need.
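One way to picture dynamic assembly with multiple views is the Python sketch below. Source records linked to one master entity are combined on request, and each view applies its own attribute-level survivorship rules. The sample data, the two rules, and the view names are assumptions chosen for illustration. Real Data Hub products may assemble records quite differently.

```python
from datetime import date

# Source-system records linked to one master entity (illustrative data).
SOURCE_RECORDS = [
    {"system": "crm", "updated": date(2010, 3, 1),
     "name": "Acme Corp", "phone": None},
    {"system": "billing", "updated": date(2010, 5, 9),
     "name": "ACME Corporation", "phone": "555-0100"},
]

def most_recent(records, attribute):
    """Survivorship rule: the latest non-empty value wins."""
    candidates = [r for r in records if r.get(attribute)]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r["updated"])[attribute]

def prefer_system(system):
    """Survivorship rule factory: trust one system, else fall back to recency."""
    def rule(records, attribute):
        trusted = [r for r in records if r["system"] == system and r.get(attribute)]
        return trusted[0][attribute] if trusted else most_recent(records, attribute)
    return rule

# Each view maps attributes to the rule that decides the surviving value.
VIEWS = {
    "default": {"name": most_recent, "phone": most_recent},
    "sales": {"name": prefer_system("crm"), "phone": most_recent},
}

def assemble_golden_record(records, view):
    """Build the golden record on request; nothing is persisted."""
    return {attr: rule(records, attr) for attr, rule in VIEWS[view].items()}

default_view = assemble_golden_record(SOURCE_RECORDS, "default")
sales_view = assemble_golden_record(SOURCE_RECORDS, "sales")
```

Here the two views disagree on the customer name while sharing the same phone number, which shows how one set of source records can serve different line-of-business requirements without storing several golden records.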
Conversely, we can offer a strong argument in favor of persistently stored master data records. This argument is driven by the need to support the history and auditability of the master data life cycle. There are at least two major usage patterns for the history of master data:
• The first pattern is driven by audit requirements. The enterprise needs to be able to understand the origin, the time, and possibly the reason for a change. These audit and compliance requirements have to be supported by the Data Hub at the attribute level. MDM solutions that maintain the golden record dynamically can address this need by supporting the history of changes in the source system’s record content.
• The second usage pattern for history support results from the need to support database queries on data referring to a certain point in time or certain time range – for example, what was the inventory on a certain date, or sales over the second quarter? A classic example of this type of history support is the implementation and management of slowly changing dimensions in data warehousing. In order to support this usage pattern, the golden version of the master record must be persisted. It is just a question of location. Many enterprises decide this question in favor of data warehousing dimensions while avoiding the persistently stored golden record in the Data Hub.
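Both usage patterns can be illustrated with a small Python sketch of an append-only, attribute-level change history. It answers an audit question (who changed an attribute, when, and why) and a point-in-time question (what did the record look like on a given date). The `AttributeHistory` class and its sample data are hypothetical. The sketch is loosely analogous to, but much simpler than, a slowly changing dimension in a data warehouse.

```python
from datetime import date

class AttributeHistory:
    """Append-only, attribute-level change history (illustrative sketch)."""
    def __init__(self):
        # Each entry: (effective_date, key, attribute, value, source, reason)
        self._changes = []

    def record_change(self, effective, key, attribute, value, source, reason=""):
        self._changes.append((effective, key, attribute, value, source, reason))

    def as_of(self, key, when):
        """Reconstruct the record as it stood on a given date."""
        snapshot = {}
        ordered = sorted(self._changes, key=lambda change: change[0])
        for effective, k, attribute, value, _source, _reason in ordered:
            if k == key and effective <= when:
                snapshot[attribute] = value  # later changes overwrite earlier ones
        return snapshot

    def audit_trail(self, key, attribute):
        """Answer: which source changed this attribute, when, and why?"""
        return [
            (effective, source, reason)
            for effective, k, a, _value, source, reason in self._changes
            if k == key and a == attribute
        ]

history = AttributeHistory()
history.record_change(date(2010, 1, 5), "C-1", "city", "Boston", "crm", "initial load")
history.record_change(date(2010, 4, 20), "C-1", "city", "Chicago", "billing", "customer moved")

q1_city = history.as_of("C-1", date(2010, 3, 31))["city"]  # end of first quarter
q2_city = history.as_of("C-1", date(2010, 6, 30))["city"]  # end of second quarter
trail = history.audit_trail("C-1", "city")
```

Whether such a history lives in the Data Hub or in data warehousing dimensions is the location question raised above. The sketch takes no position on it.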
In short, modern MDM Data Hub systems function as active components of service-oriented architecture by providing master data services, rather than being passive repositories of cleansed data. This consideration should help the enterprise architects and systems integrators build sound Master Data Management solutions. An additional discussion about these differences can be found in Chapter 6.
Excerpted from Master Data Management and Data Governance, 2nd Edition, by Alex Berson and Larry Dubov (McGraw-Hill, 2010) with permission from McGraw-Hill.