How to manage unstructured data using an ECM system

Enterprise content management systems are the most common place for businesses to manage unstructured data, which includes documents, spreadsheets, and audio and video files.

Businesses generate an enormous amount of information, which can be unwieldy if they don't manage it properly.

Companies create both structured and unstructured data, and manage it in a number of different places. Structured data is typically found in CRM systems and ERP platforms, while unstructured data is a component of knowledge bases, digital asset management systems and enterprise content management systems.

The ECM system, however, is the best -- and most common -- place in which to manage and store unstructured data.

The differences between structured and unstructured data

Structured data is information governed by a database structure, organized into defined fields, usually within the context of a relational database. The database structure requires that data in the fields follow a prescribed format. For example, a date must have the format of a date and a name must be limited in length. The most common place that people encounter structured data is in the cells of a spreadsheet.
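
As a minimal sketch of what "a prescribed format" means in practice, the Python below rejects a record whose date is not a real date or whose name is too long. The `CustomerRecord` class, the `parse_record` helper and the 50-character limit are assumptions made for illustration; an actual database would enforce such rules through its own column types and constraints.

```python
from dataclasses import dataclass
from datetime import date

MAX_NAME_LENGTH = 50  # hypothetical limit, chosen only for this example


@dataclass
class CustomerRecord:
    """A structured record: each field has a prescribed type and format."""
    name: str
    signup_date: date

    def __post_init__(self):
        # Enforce the length rule that a database column would impose.
        if len(self.name) > MAX_NAME_LENGTH:
            raise ValueError(f"name exceeds {MAX_NAME_LENGTH} characters")
        # Reject anything that is not an actual date value.
        if not isinstance(self.signup_date, date):
            raise TypeError("signup_date must be a datetime.date")


def parse_record(name: str, signup_date_text: str) -> CustomerRecord:
    """Build a record from raw text; malformed ISO dates raise ValueError."""
    return CustomerRecord(name=name, signup_date=date.fromisoformat(signup_date_text))
```

Calling `parse_record("Ada Lovelace", "2024-01-31")` succeeds, while a date written as `31/01/2024` or a name longer than the limit raises an error instead of being stored.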

Structured data has many applications within businesses and is easy to search. It is found in finance, customer relationship management, supply chain and other applications where compliance with a defined structure is tied to business tasks.

Unstructured data, on the other hand, follows no prescribed rules and is harder to search. Users who create unstructured data are writing free-form, rather than complying with structured data fields. There is minimal enforcement of any rules on the length of content, the format of the content or what content goes where. Despite the lack of formal structure, unstructured information -- which users create in word processing programs, spreadsheets, presentation files, PDFs, social media feeds, and audio and video files -- forms the bulk of the data created in an organization. Unstructured data might contain engineering design specifications, marketing plans or a summary of financial performance.

Some businesses choose to put this unstructured content in a simple file-sharing system, but doing so limits its access and searchability. Instead, unstructured data works best in an ECM system -- especially one that is content services-aware -- so search integration, data mining, text analytics and other tools can help users discover, use and collaborate on the content within.

How to manage and access unstructured data using an ECM

An enterprise content management system organizes unstructured content for access and searchability. By storing unstructured data in association with metadata about it, an ECM system provides some explanation of what is inside.

Here are some functions and tools within an ECM system that help manage unstructured data:

  • Automation. This can expose metadata that is embedded in the unstructured data binaries, such as content author, content modification date and content title.
  • AI scanning. Artificial intelligence can apply data analytics across a whole repository of data to determine important trends, comparing each piece of unstructured data to all of the unstructured content the AI has seen. It can also apply business rules or algorithms to the data to understand how the content performs in marketing or in other uses.
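
To make the automation point concrete, here is a rough sketch of reading metadata embedded in a file's binary. It assumes the file is an Office Open XML document (such as a .docx), which is a ZIP archive that stores author, title and modification date in `docProps/core.xml`. The function name `extract_core_metadata` is hypothetical, and this is not a description of how any particular ECM product works.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the core-properties part of an Office Open XML file.
NS = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}


def extract_core_metadata(docx_bytes: bytes) -> dict:
    """Return author, title and modification date embedded in the file."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        root = ET.fromstring(archive.read("docProps/core.xml"))

    def text_of(tag: str):
        # A property may be absent, so fall back to None.
        element = root.find(tag, NS)
        return element.text if element is not None else None

    return {
        "author": text_of("dc:creator"),
        "title": text_of("dc:title"),
        "modified": text_of("dcterms:modified"),
    }
```

A repository could run a step like this on upload and store the resulting dictionary alongside the file, so the content becomes findable without anyone tagging it by hand.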

There are three primary ways to access unstructured data in an enterprise content management system:

  • Metadata. Metadata enables unstructured content to turn up in enterprise search results based on commonly defined characteristics. Metadata on a bank statement, for example, might include the date of the statement and the account number that the statement covers. Business users can tag content either manually or automatically, using software such as Microsoft SharePoint.
  • Direct link. A bank, for example, may have direct links to common terms and conditions from within its website, linking back to its content management system.
  • Full-text search. Full-text search enables users to search for all credit card statements with a charge to Barnes & Noble, for example. That search would use metadata to search only statements and then use full text to find the statements in question.
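
The two-stage search described in the last bullet can be sketched as follows: metadata narrows the candidates to statements, then a text match finds the ones that mention the merchant. The `Document` class, the `search` function and the metadata keys are hypothetical, and the naive substring match stands in for the indexed full-text engine a real ECM system would use.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    metadata: dict  # e.g. {"type": "credit_card_statement"}
    text: str       # extracted full text of the document


def search(documents, metadata_filter: dict, phrase: str) -> list:
    """Return IDs of documents that match the metadata and contain the phrase."""
    phrase_lower = phrase.lower()
    matches = []
    for doc in documents:
        # Stage 1: keep only documents whose metadata matches every filter key.
        if not all(doc.metadata.get(k) == v for k, v in metadata_filter.items()):
            continue
        # Stage 2: case-insensitive full-text match on the remaining documents.
        if phrase_lower in doc.text.lower():
            matches.append(doc.doc_id)
    return matches
```

Filtering on metadata first keeps the more expensive text scan limited to the documents that could plausibly match, which is why the two techniques are usually combined rather than used alone.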

Structured and unstructured data do not work in a vacuum. They need to be used in the correct business context to be of value and they depend on the metadata assigned to them. For example, someone may apply for a mortgage and provide the bank with a number of documents, including tax returns, bank statements, paycheck stubs and credit reports. In this case, the context of these documents exists in two forms -- the first is the mortgage application and the second is the applicant. But if that same applicant applies for a car loan down the road with that same bank, the bank can use those same documents in a third context -- a car loan application.
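
One way to picture this reuse is a many-to-many link between documents and business contexts, so a document is stored once and attached to each context that needs it. The `ContextIndex` class and the identifiers below are invented for illustration and do not reflect any specific product's data model.

```python
from collections import defaultdict


class ContextIndex:
    """Links stored documents to the business contexts that use them."""

    def __init__(self):
        self._docs_by_context = defaultdict(set)
        self._contexts_by_doc = defaultdict(set)

    def link(self, doc_id: str, context: str) -> None:
        # Record the relationship in both directions; the file itself is not copied.
        self._docs_by_context[context].add(doc_id)
        self._contexts_by_doc[doc_id].add(context)

    def documents_for(self, context: str) -> list:
        return sorted(self._docs_by_context.get(context, ()))

    def contexts_for(self, doc_id: str) -> list:
        return sorted(self._contexts_by_doc.get(doc_id, ()))


index = ContextIndex()
for doc in ("tax_return_2023", "bank_statement_jan"):
    index.link(doc, "applicant:jane_doe")
    index.link(doc, "mortgage_application:1001")
# Later, the same documents are attached to a car loan without re-uploading them.
index.link("tax_return_2023", "car_loan_application:2002")
```

Here `index.contexts_for("tax_return_2023")` lists all three contexts, while the bank statement remains linked only to the applicant and the mortgage application.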
