One of the leading causes of data quality problems is siloed data. It produces duplicate data and, in many cases, different understandings of important business processes. That's why it's important to make breaking down data silos a central component of any data quality initiative.
What is a data silo?
A data silo is a collection of information that isn't effectively shared across an organization. The business unit that owns the siloed system, and perhaps a few other groups, know the data exists, but the rest of the organization is unaware that it's available or how it's being used.
Some of the most common causes of data silos are:
Lack of organizational oversight. The organization isn't effectively managing data as a corporate asset. For example, there's no entity that governs data and how it's used at the enterprise level.
Business growth. During growth periods, organizations often add new operational units to better focus on key aspects of their business. The units create or acquire data elements that meet what they think are their own unique business needs. As the number of units expands, so do the data silos.
Cultural issues. Business units often see data they control as an asset that increases their importance to the organization. A less dysfunctional reason for a unit to keep data siloed is the fear that sharing it may lead to a loss of control over data quality, or force unwanted changes to the unit's systems if data definitions are modified to meet enterprise needs.
Technical reasons. The increasing popularity of third-party and, more specifically, SaaS applications is accelerating data silo growth. That's because many of these applications store information in vendor-supplied cloud data silos.
Why are data silos bad?
If you have been in IT for any length of time, there's a good chance you understand the problems data silos cause, including:
- No single source of truth to rely on for high-quality decision-making
- Duplication of data in systems
- Duplicate efforts to create and manage the same information
- The inability to build a 360-degree view of a customer, partner or product
- Not being able to identify enterprise-wide business trends
- Different definitions for the same data elements in separate systems, leading to inconsistencies and poor data quality
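Two of the problems above, duplicated data and conflicting values for the same entity, can be surfaced with a simple comparison of extracts from separate systems. The following is a minimal sketch, not a production matching routine: the system names (`crm`, `billing`), the record fields and the choice of a normalized email address as the matching key are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical extracts from two siloed systems (illustrative data only).
crm_records = [
    {"email": "Ana@Example.com ", "name": "Ana Ruiz", "status": "active"},
    {"email": "bo@example.com", "name": "Bo Chen", "status": "active"},
]
billing_records = [
    {"email": "ana@example.com", "name": "Ana Ruiz", "status": "closed"},
    {"email": "cy@example.com", "name": "Cy Patel", "status": "active"},
]


def normalize_key(email: str) -> str:
    """Normalize the matching key so trivially different values compare equal."""
    return email.strip().lower()


def find_overlaps(sources):
    """Return keys present in more than one source, mapped to each source's record."""
    seen = defaultdict(dict)
    for source_name, records in sources.items():
        for record in records:
            seen[normalize_key(record["email"])][source_name] = record
    return {key: by_source for key, by_source in seen.items() if len(by_source) > 1}


def find_conflicts(by_source, fields):
    """For one duplicated entity, list the fields whose values disagree across sources."""
    conflicts = {}
    for field in fields:
        values = {source: record[field] for source, record in by_source.items()}
        if len(set(values.values())) > 1:
            conflicts[field] = values
    return conflicts


overlaps = find_overlaps({"crm": crm_records, "billing": billing_records})
report = {
    key: find_conflicts(by_source, ["name", "status"])
    for key, by_source in overlaps.items()
}
print(report)
```

In this sample data, one customer appears in both systems with a different status in each, which is the kind of inconsistency that makes a single source of truth hard to establish.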
Data governance and breaking down data silos
Reducing the number of data silos and their negative impact requires both prevention and corrective-action strategies. Here are some steps to take.
First, build a culture of viewing all corporate data as an enterprise asset. Promote the benefits of good data quality, data sharing and a single source of truth across IT and business units.
Next, assign ownership of enterprise-wide data governance to a person or team, depending on your organization's size. Give them the responsibility and authority for governing all data assets and managing data quality, which includes consolidating data silos and building data-sharing procedures. Their role should include helping developers determine if the information they need already exists and enforcing where new data elements will be stored.
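One way a governance team might help developers check whether information already exists is a searchable catalog of data elements. The sketch below assumes a hypothetical in-memory catalog; the `CatalogEntry` fields, the sample entries and the alias-matching rule are illustrative, and a real catalog would typically live in a metadata or data catalog tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str                # canonical element name
    system_of_record: str    # where the authoritative copy lives
    owner: str               # accountable team or steward
    aliases: tuple = ()      # other names the element goes by


# Hypothetical catalog contents, for illustration only.
CATALOG = [
    CatalogEntry("customer_email", "crm", "sales-ops",
                 ("email", "cust_email", "contact_email")),
    CatalogEntry("invoice_total", "billing", "finance", ("amount_due",)),
]


def canonical(name: str) -> str:
    """Reduce a requested name to a comparable form."""
    return name.strip().lower().replace("-", "_").replace(" ", "_")


def find_existing(requested: str, catalog=CATALOG):
    """Return the catalog entry matching the requested name or alias, else None."""
    wanted = canonical(requested)
    for entry in catalog:
        names = {canonical(entry.name), *(canonical(a) for a in entry.aliases)}
        if wanted in names:
            return entry
    return None


match = find_existing("Cust Email")
if match:
    print(f"Reuse {match.name} from {match.system_of_record} (owner: {match.owner})")
else:
    print("No existing element found; request a new one through governance")
```

A lookup like this gives the governance team a natural checkpoint: a new data element is approved only after the search comes back empty.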
Once you identify a data silo, meet with the owner, evaluate the data's usage and determine if the silo should be consolidated, replaced or treated as a system of record for the organization.
If you don't already have one, evaluate the need for a data warehouse environment to consolidate data from multiple operational sources. Data warehouses help you to tightly govern and easily share the information you are duplicating from operational systems.
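As a rough illustration of consolidating operational sources in one governed place, the sketch below loads extracts from two hypothetical systems into a single staging table and records which system each row came from. It uses Python's built-in `sqlite3` module as a stand-in for a warehouse; the table name, columns and sample rows are assumptions.

```python
import sqlite3

# Hypothetical extracts from operational systems (illustrative data only).
SOURCES = {
    "crm": [("ana@example.com", "Ana Ruiz")],
    "support": [("ana@example.com", "Ana Ruiz"), ("bo@example.com", "Bo Chen")],
}


def load_warehouse(sources):
    """Load every source into one staging table, tagging each row with its origin."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_staging ("
        "email TEXT NOT NULL, name TEXT NOT NULL, source_system TEXT NOT NULL)"
    )
    for source_system, rows in sources.items():
        conn.executemany(
            "INSERT INTO customer_staging (email, name, source_system) "
            "VALUES (?, ?, ?)",
            [(email, name, source_system) for email, name in rows],
        )
    conn.commit()
    return conn


conn = load_warehouse(SOURCES)

# Customers that more than one operational system holds a copy of.
shared = conn.execute(
    "SELECT email, COUNT(DISTINCT source_system) "
    "FROM customer_staging GROUP BY email "
    "HAVING COUNT(DISTINCT source_system) > 1"
).fetchall()
print(shared)
```

Keeping a `source_system` column preserves lineage, so the duplicated data stays traceable to the operational system it was copied from.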
Also, evaluate master data management software and data quality tools to facilitate effective data governance and help you build a single source of truth. The alternatives range from open source offerings like Talend to commercial platforms from IBM, Informatica, Information Builders, Oracle, SAP, SAS and other vendors.
Building a data governance policy
A data governance policy outlines the roles, rules, processes and best practices that an organization follows to ensure the quality and proper use of its data; it can also help break down data silos. A suggested outline starts with a statement of purpose that defines the governance policy's mission and goals, backed up by executive sponsor signatures. Additionally, a policy should cover the following aspects of a data governance program:
Structure. An organizational structure is necessary, including senior management sponsors, a steering committee that sets data standards and rules, team members responsible for enterprise-level data quality, and data stewards who assist in the governance of departmental information. The policy should spell out each role's governance responsibilities, activities and authority.
Data creation. There should be policies that control the ownership, storage and definition of new data elements. These policies should also include security and regulatory framework classifications, access methods and data auditing, retention, backup and archiving measures.
Data access. Data request procedures should cover security and regulatory adherence reviews, read vs. update permissions, data access procedures and tools, and performance impact metrics.
Data usage. Write an ethical code of conduct for data that includes strictures on misuse, unauthorized changes, deliberate falsification and intentional destruction.
Data integrity. Institute procedures that protect the quality and lineage of existing data, including guidelines on modifying data definitions and updating data elements.
Data correction. Define procedures outlining the steps to identify, correct and determine the root cause of poor-quality data.
Data sharing. Best practices for organizational data sharing include providing controls for interdepartmental data modifications, provisions for identification and consolidation of data silos, and shared data store creation and usage guidelines.
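The data correction aspect above calls for steps to identify poor-quality data and trace its root cause. The sketch below shows one possible shape for the identification step, under stated assumptions: the two quality rules, the record fields and the `source` attribute used to group failures are hypothetical, and counting failures by source is only a first hint at a root cause, not a diagnosis.

```python
from collections import Counter

# Hypothetical quality rules: each returns True when a record violates it.
RULES = {
    "missing_email": lambda r: not r.get("email"),
    "negative_balance": lambda r: r.get("balance", 0) < 0,
}

# Illustrative records; "source" names the system or channel that created each one.
records = [
    {"id": 1, "source": "crm", "email": "ana@example.com", "balance": 10},
    {"id": 2, "source": "web_form", "email": "", "balance": 5},
    {"id": 3, "source": "web_form", "email": None, "balance": -2},
]


def find_issues(records, rules):
    """Return (record id, source, rule name) for every rule a record violates."""
    issues = []
    for record in records:
        for rule_name, is_violated in rules.items():
            if is_violated(record):
                issues.append((record["id"], record["source"], rule_name))
    return issues


issues = find_issues(records, RULES)

# Group failures by originating source to see where to start investigating.
by_source = Counter(source for _, source, _ in issues)
print(issues)
print(by_source.most_common())
```

Here every failure traces back to the same entry point, which suggests fixing validation at that source rather than repeatedly correcting individual records downstream.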