What are data silos?
A data silo is a repository of data that's controlled by one department or business unit and isolated from the rest of an organization, much like grass and grain in a farm silo are closed off from outside elements. Siloed data typically is stored in a standalone system and often is incompatible with other data sets. That makes it hard for users in other parts of the organization to access and use the data.
Data silos can have technical, organizational or cultural roots. They tend to arise naturally in large companies because separate business units may operate independently and have their own goals, priorities and IT budgets. But any organization can end up with data silos if it doesn't have a well-planned data management strategy.
Why are data silos a problem?
Data silos hinder business operations and the data analytics initiatives that support them. Silos limit the ability of executives to use data to manage business processes and make informed business decisions. They also prevent call center agents, sales reps and other operational workers from accessing relevant data about customers, products, supply chains and more.
The specific ways that data silos can harm an organization include the following:
- Incomplete data sets. Data silos keep data out of reach of users outside the department that controls it. As a result, business strategies and decisions aren't based on all of the available data, which can lead to flawed decision-making. Silos can also derail efforts to build data warehouses and data lakes that integrate different data sets for business intelligence (BI) and analytics applications.
- Inconsistent data. Many data silos aren't consistent with other data sets. For example, a marketing team may format customer data differently than other departments. Data errors made by a sales team may not be identified and fixed. Data updates made in other systems may never reach a siloed customer service system. Such inconsistencies create data quality, accuracy and integrity issues that affect end users in both operational and analytics applications.
- Duplicate data platforms and processes. Data silos add to IT costs by increasing the number of servers and storage devices an organization needs to buy. In many cases, those systems are also deployed and managed separately by departments instead of an organization's data management team. That further increases spending and leads to inefficient use of IT resources.
- Less collaboration between end users. Isolated data sets in silos reduce the opportunities for data sharing and collaboration between users in different departments. It's harder to work together effectively when people don't have visibility into siloed data.
- A silo mentality in departments. Data silos contribute to organizational silos: departments and business units that guard their data closely and are reluctant to share it with others. They may also resist data governance programs that aim to break down data silos and ensure that data is consistent and correct across all of an organization's systems.
- Data security and regulatory compliance issues. Some data silos are stored by individual users in Excel spreadsheets or online business tools like Google Drive, often on mobile devices. That increases data security and privacy risks for organizations if they don't have suitable controls. Silos also complicate efforts to comply with data privacy and protection laws.
How data silos occur
A department or end user may go rogue and create a data silo even in an organization that has solid data management processes. More often, though, data silos are a consequence of how organizations are structured and managed as a whole, including their IT operations. The following factors commonly cause silos to occur:
- IT strategy and technology deployments. Some organizations have decentralized IT buying decisions and allow departments and business units to purchase technologies on their own. This often leads to the deployment of databases and business applications that aren't compatible with or connected to other systems. The same thing can happen when corporate IT teams are involved in purchasing decisions if a department needs a particular technology. The variety of data platforms now available also helps drive data silos: In addition to mainstream relational databases, organizations can deploy big data platforms, NoSQL databases, cloud object storage services and special-purpose databases to meet different business needs.
- Organizational structure and management. Data silos regularly occur when business units are fully decentralized and managed as separate entities. That's most common in large organizations with different subsidiaries and operating companies, but it can happen in smaller ones with a similar structure and management approach.
- Corporate culture and principles. Even when IT and business operations are managed in a more unified way, company culture can spur the creation of data silos. There are fewer incentives to avoid them if data sharing isn't a cultural norm and an organization doesn't have common goals and principles for managing data. Departments may also view their data as an asset that they own and control, further encouraging data silo development.
- Business growth and acquisitions. Growing organizations are prone to data silos. As a company expands, new business needs may have to be addressed quickly and additional business units may be created. Both of those situations are natural data silo incubators. Mergers and acquisitions also bring silos into an organization, some known and some that may be hidden.
How do you identify data silos?
Because of their disconnected nature, data silos can be hard to detect. Ideally, IT and data management teams will create an inventory of the systems in their organizations and regularly update it to add new ones. Doing so should help identify and document data silos. But finding them all may be a challenge, especially in large organizations with business units that operate autonomously.
Evidence of data silos may come to light, though. Signs that point to them include:
- different departments reporting inconsistent data;
- BI and data science teams not being able to find or access relevant data;
- executives complaining about a lack of data on some business operations;
- end users discovering that data sets are incomplete or out of date; and
- unexpected, out-of-budget IT costs suddenly materializing.
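The first sign on that list, departments reporting inconsistent data, can sometimes be checked directly by comparing records exported from two systems. The following is a minimal sketch, not a definitive method. The function name find_inconsistencies, the customer_id and email fields, and the sample records are all hypothetical and used only for illustration.

```python
def find_inconsistencies(sales, support, key="customer_id", fields=("email",)):
    """Compare two departmental record lists and report where they disagree.

    All names here (key, fields, report keys) are illustrative assumptions.
    """
    sales_by_key = {row[key]: row for row in sales}
    support_by_key = {row[key]: row for row in support}

    report = {
        # Records that exist in one department's data but not the other's.
        "only_in_sales": sorted(sales_by_key.keys() - support_by_key.keys()),
        "only_in_support": sorted(support_by_key.keys() - sales_by_key.keys()),
        # Records present in both, but with conflicting field values.
        "mismatched": [],
    }
    for k in sorted(sales_by_key.keys() & support_by_key.keys()):
        for field in fields:
            a = sales_by_key[k].get(field)
            b = support_by_key[k].get(field)
            if a != b:
                report["mismatched"].append((k, field, a, b))
    return report


# Hypothetical exports from two departmental systems.
sales_records = [
    {"customer_id": 1, "email": "ana@example.com"},
    {"customer_id": 2, "email": "bo@example.com"},
    {"customer_id": 3, "email": "cy@example.com"},
]
support_records = [
    {"customer_id": 1, "email": "ana@example.com"},
    {"customer_id": 2, "email": "bo@old-domain.example"},
    {"customer_id": 4, "email": "di@example.com"},
]

report = find_inconsistencies(sales_records, support_records)
print(report)
```

Records that appear in only one system, or that carry conflicting values, don't prove that a silo exists, but they are a reasonable prompt to look into how the two systems are maintained.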
How do you break down data silos?
Eliminating data silos enables an organization to manage and use data more effectively. It often also helps lower technology and data management costs. The following approaches can be used separately or in tandem to remove silos and connect data assets to better support business operations:
- Data integration. Integrating data silos with other systems is the most straightforward way to break them down. The most popular form of data integration is extract, transform and load (ETL), which extracts data from source systems, transforms and consolidates it, and loads it into a target system or application. Other data integration techniques that can be used against silos include real-time integration, data virtualization and extract, load and transform (ELT), a variation on ETL.
- Data warehouses and data lakes. The most common target system in data integration jobs is a data warehouse, which stores structured transaction data for BI, analytics and reporting applications. Increasingly, organizations also build data lakes to hold sets of big data, which can include large volumes of structured, unstructured and semistructured data used in data science applications. Those two types of platforms provide centralized repositories for data from different systems, making them a natural way to address silos.
- Enterprise data management and governance. Ultimately, it's best to not only eliminate existing data silos but also prevent new ones from being created. A more comprehensive data management strategy helps achieve both those goals. For example, data architecture design documents data assets, maps data flows and creates a blueprint for data platform deployments. An enterprise data strategy better aligns the data management process with business operations. And a strong data governance program can directly reduce the number of data silos in an organization and promote common data standards and policies.
- Culture change. To really put a stop to data silos, it may be necessary to change an organization's culture. Efforts to do so can be part of the data strategy development process or a data governance initiative. In some cases, a change management program may be needed to implement the cultural changes and ensure that departments and business units adopt them.
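To make the ETL approach described above more concrete, here is a minimal sketch that pulls customer records from two hypothetical silos, converts them to one format and loads them into a single table. It assumes an in-memory SQLite database as a stand-in for a data warehouse. The source data, field names and the extract, transform and load helpers are invented for illustration and don't represent any particular product's API.

```python
import sqlite3

# Hypothetical extracts from two siloed systems that format data differently.
marketing_rows = [
    {"CustomerID": "001", "Email": "Ana@Example.com ", "Region": "emea"},
    {"CustomerID": "002", "Email": "bo@example.com", "Region": "APAC"},
]
sales_rows = [
    (2, "BO@example.com", "apac"),
    (3, "cy@example.com", "amer"),
]


def extract():
    """Extract: yield raw records tagged with the system they came from."""
    for row in marketing_rows:
        yield "marketing", row
    for row in sales_rows:
        yield "sales", row


def transform(source, row):
    """Transform: map each source's layout to (customer_id, email, region)."""
    if source == "marketing":
        cid, email, region = int(row["CustomerID"]), row["Email"], row["Region"]
    else:
        cid, email, region = row
    # Normalize formatting so records from both systems are comparable.
    return cid, email.strip().lower(), region.strip().upper()


def load(conn, records):
    """Load: write the consolidated records into one target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers ("
        "customer_id INTEGER PRIMARY KEY, email TEXT, region TEXT)"
    )
    # INSERT OR REPLACE keeps one row per customer_id; the last record wins.
    conn.executemany("INSERT OR REPLACE INTO customers VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")  # stand-in for a real warehouse
load(conn, [transform(source, row) for source, row in extract()])
customers = conn.execute(
    "SELECT customer_id, email, region FROM customers ORDER BY customer_id"
).fetchall()
print(customers)
```

The "last record wins" rule is a deliberate simplification. In practice, deciding which system is authoritative for each field is usually a data governance question rather than a coding one.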
What are the business costs of data silos?
The financial cost of data silos depends on the organization: how many silos it has, how successful efforts to eliminate them are and whether they continue to proliferate. In general, increased IT and data management expenses are the most tangible cost. But data silos also have various hidden costs, including:
- reduced productivity;
- less effective business management;
- missed business opportunities;
- lower-quality customer service; and
- a lack of trust in data that limits its use and its business benefits.
The terms data silo and information silo are sometimes used as synonyms. More often, though, information silos are considered to be a cultural problem caused by departments or individual workers who don't want to share information. In addition to cultural change, one way to address the latter problem is to create an information architecture along with a data architecture.