What is content sprawl?
Content sprawl is a condition of an organization's content assets in which proliferation and unmanaged growth have led to an unwieldy mass that is difficult to manage. In addition, the jumble of sprawling content makes it challenging for end users to find what they need. Content sprawl can exist in a self-contained system (e.g., public website or private intranet), as well as across an organization's IT infrastructure.
A personal computer or laptop provides a convenient analogy for content sprawl. When a new computer is turned on, it's in a pristine state, with a limited set of files and applications. In the first few weeks, users know where everything can be found.
Over time, users create folders and sub-folders, download applications and generate documents, spreadsheets, videos and more. Soon enough, documents end up in both expected and unexpected locations: file folders, the desktop, the downloads folder or the computer's system folder.
Important assets are also stored as attachments in an email client, documents on a remote file server and photos on an enterprise file sync-and-share (EFSS) service. The result is content sprawl. When users need a presentation for their meeting, they don't know where the latest version is stored. They open the version sitting on their desktop, but the latest one is really in an email sent from a colleague.
What causes content sprawl?
Organizations create content sprawl in a manner similar to the previous computer example. The causes include the following:
Migration from legacy systems. Organizations may have legacy systems that are more than 20 years old. Although newer systems have been deployed, the legacy systems are not fully decommissioned. This may be because there is no budget to shut them down properly or because a business process still relies on the legacy system's existing content structure. In either case, the legacy system remains available to end users. As more new technology is deployed, the coexistence of legacy and new systems adds to content sprawl.
Dynamic content. End users can cause content sprawl through the way they manage documents and files in their day-to-day work. Applications can also cause content sprawl by generating dynamic content, such as files and records that the application creates automatically. Examples of dynamic content that can contribute to content sprawl include the following:
- application log files;
- auto-generated reports;
- author profiles, comment threads and forum posts;
- temporary copies of documents that remain in file folders; and
- call recordings or transcripts.
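To get a rough sense of how much space application-generated files occupy, a short script can total file sizes by name pattern. The following is a minimal sketch, not a definitive tool: the `PATTERNS` table and the `generated_bytes` function are hypothetical names chosen for illustration, and the patterns themselves are assumptions that should be replaced with whatever an organization's applications actually write.

```python
from collections import Counter
from fnmatch import fnmatch
from pathlib import Path

# Hypothetical name patterns; adjust them to match your own applications.
PATTERNS = {
    "log files": "*.log",
    "temporary files": "*.tmp",
    "office lock files": "~$*",
}


def generated_bytes(root: Path) -> Counter:
    """Return the total bytes under root for each pattern label."""
    totals = Counter()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        for label, pattern in PATTERNS.items():
            if fnmatch(path.name, pattern):
                totals[label] += path.stat().st_size
                break  # count each file under the first matching label only
    return totals
```

Calling `generated_bytes(Path("/some/share"))` on a folder returns a per-label byte count, which can help decide where cleanup or retention rules are worth the effort.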
Growth of shadow IT. With the bring your own device (BYOD) movement, employees increasingly use their personal devices, such as a laptop, tablet or smartphone, to complete work-related tasks. In addition, employees store corporate data on consumer-friendly cloud services, such as Google Docs, Dropbox and Box. Multiply this behavior across hundreds of employees, and content ends up scattered across devices and services that IT does not manage.
Importance of fixing content sprawl
Because of the negative business impact that can result, it's important to fix content sprawl. Examples of how content sprawl can affect an organization include the following:
It is hard to find the right document or asset. Productivity suffers when employees spend too much time looking for a document they need. Added up across employees, the time spent on needless searching can amount to a productivity loss of thousands of dollars or more. On a website or intranet, search quality can also suffer, since results may return several similar copies of the same page or document. As a result, users may get confused and lose confidence in the site.
It creates challenges with version management. If documents are stored in multiple locations (e.g., a file server, EFSS service and an employee's hard drive), it's difficult to determine which one is the latest version. Users will update different versions, further compounding the content sprawl problem. Such a scenario can be costly to a company. For example, if language is added to the wrong version of a sales contract, erroneous pricing information could be sent to a client, potentially costing the company millions of dollars in lost revenue.
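The version problem can be illustrated with a small sketch. It assumes each copy of a document can be described by where it lives, when it was last modified and a checksum of its content. The `Copy` class, the `summarize_copies` function and the location labels are hypothetical, and last-modified timestamps are themselves an assumption, since some systems reset them when a file is copied.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Copy:
    location: str       # where this copy lives (hypothetical label)
    modified: datetime  # last-modified time reported by that system
    checksum: str       # content hash, so identical copies can be recognized


def summarize_copies(copies: list[Copy]) -> tuple[Copy, list[Copy]]:
    """Return the newest copy and the copies whose content differs from it."""
    if not copies:
        raise ValueError("at least one copy is required")
    latest = max(copies, key=lambda c: c.modified)
    divergent = [c for c in copies if c.checksum != latest.checksum]
    return latest, divergent
```

Any copy returned in the second list has content that differs from the newest copy, so it is either stale or has been edited separately and needs a manual review before anyone relies on it.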
How to fix content sprawl
While it may be challenging to eliminate content sprawl altogether, there are ways to keep it under control:
Consolidation and migration. Consider it the "legacy system challenge" in reverse. Instead of keeping new systems and legacy systems available simultaneously, decommission the unneeded systems and migrate their data elsewhere. After migrating the content and data off a system, ensure no end users or applications still connect to it before you turn it off. In addition, organizations can migrate content to offline storage, also known as cold storage. This way, the data is retained and can be retrieved if needed for compliance or other reasons.
Content audits. Organizations can conduct content audits to determine where critical information and documents reside. An inventory can be created by asking what exists where and by documenting content owners, such as departments or business units. Content can be put through a ROT (redundant, outdated, trivial) evaluation. If it doesn't pass the test, it can be deleted or archived. Content audits are most effective when done on a regular and consistent basis.
Content federation. With content federation (also known as content integration services), content and documents are made available within applications while remaining in the original system. Users can read and write to the document without storing a local copy. As a result, only a single copy of the document is maintained, which avoids much of the version control overhead that multiple copies create.
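Part of a ROT evaluation can be automated for file-based content. The sketch below is one possible approach under stated assumptions, not a complete audit: it treats files with identical bytes as redundant candidates and files not modified within a cutoff as outdated candidates. The `audit` function, its `max_age_days` default of 730 days and the result keys are hypothetical choices. Judging whether content is trivial, and who owns it, still requires human review.

```python
import hashlib
import time
from collections import defaultdict
from pathlib import Path
from typing import Optional


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit(root: Path, max_age_days: int = 730, now: Optional[float] = None) -> dict:
    """Inventory files under root and flag redundant and outdated candidates."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    by_digest = defaultdict(list)
    outdated = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        by_digest[file_digest(path)].append(path)
        if path.stat().st_mtime < cutoff:
            outdated.append(path)
    # Groups of two or more paths share identical content.
    redundant = [paths for paths in by_digest.values() if len(paths) > 1]
    return {"redundant": redundant, "outdated": outdated}
```

The result is a list of candidates rather than a deletion list. Content owners should confirm each item before it is archived or removed.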
Editor's note: This article was written by Dennis Shiao in 2020. TechTarget editors revised it in 2022 to improve the reader experience.