Introduction to data loss prevention products

Expert Bill Hayes describes how data loss prevention (DLP) products can help identify and plug information leaks and improve enterprise security.

We are living in a time when sensitive information flows seamlessly throughout organizations and out to employees across the globe. Unfortunately, this data can wind up in the hands of unintended recipients, who can then cherry-pick the data for their own profit. While the threat of malicious insiders is a valid concern, equally grave data exposures occur through poorly understood business processes that use insecure protocols and procedures, and when employees do not practice secure data handling.

To solve these problems, data loss prevention (DLP) tools help identify and plug information leaks before they negatively impact organizations.

Most organizations have some kind of classification scheme intended to identify the kinds of data they use. Once data is categorized, appropriate controls can be applied to monitor and control its access, transportation and storage. In the days when businesses stored information on paper and microfilm, controls such as printed access rosters, security guards, locked filing cabinets and combination safes prevented unauthorized access and dissemination. Now that most data is in digital form, companies have to use special software to detect data theft while maintaining these older security controls (as long as paper or microfilm records still exist).

DLP: Data in use, in motion, at rest

Depending on their use, DLP tools can detect and block the potential exposure of sensitive information while in use, in motion or at rest.

  • Data in use is data that is being processed, is in memory and may be present in temporary files. It poses a danger if insecure endpoint devices are processing the data or may be routing it to unapproved storage or unapproved remote locations.
  • Data in motion is data traveling across a network in a point-to-point transaction. The danger here lies in data transactions that may take sensitive information beyond the organization's perimeter or to unintended printouts or storage media.
  • Data at rest is data that is stored in digital form in persistent (not temporary) files, and can include end-user files and databases located on file servers, backup tapes, SAN storage and portable media.

Data loss prevention can ensure end users don't send sensitive information outside their organization's network or move it from secure to insecure storage. While DLP products do address the insider threat, they are also very useful as a technical control to prevent the inadvertent exposure of sensitive information by persons unfamiliar with its value or the proper way to process, transmit and store sensitive information.

How DLP works: Standalone vs. integrated

DLP products are designed to detect sensitive information as it is accessed by endpoint devices like desktops and mobile devices, as it lies dormant on a file server in forgotten documents, and as it moves through an organization's networks using any number of protocols. DLP tools address the problems of sensitive data usage, movement and storage based on an organization's understanding of what it wants to protect and where the data is allowed at any moment.
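As a rough illustration of "what it wants to protect and where the data is allowed," a policy can be modeled as a lookup from a data classification and state to a set of permitted locations. The sketch below is not any vendor's policy format; the labels ("PHI"), location names and the default-deny choice are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class DataState(Enum):
    IN_USE = "in use"
    IN_MOTION = "in motion"
    AT_REST = "at rest"


@dataclass(frozen=True)
class Rule:
    classification: str           # hypothetical label, e.g. "PHI"
    state: DataState              # which of the three states the rule covers
    allowed_locations: frozenset  # hypothetical names of permitted locations


# Illustrative rules only; a real policy would come from the organization's
# own classification scheme.
RULES = [
    Rule("PHI", DataState.AT_REST, frozenset({"records-server"})),
    Rule("PHI", DataState.IN_MOTION, frozenset({"internal-network"})),
]


def decide(classification, state, location, rules=RULES):
    """Return 'allow' or 'block' for classified data observed at a location."""
    for rule in rules:
        if rule.classification == classification and rule.state == state:
            return "allow" if location in rule.allowed_locations else "block"
    # No rule covers this combination: this sketch assumes default-deny.
    return "block"
```

Here `decide("PHI", DataState.AT_REST, "usb-drive")` returns "block" because the only permitted resting place for that label is the records server.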

Standalone DLP products can reside on specialized appliances or can be sold as software to be installed on the enterprise's own hardware. They are specialized and only address data loss prevention. A full soup-to-nuts DLP product monitors data at rest using a file scanning engine. It also features a network appliance to monitor data in transit over a company’s network on many network protocols.

An endpoint agent detects sensitive information in memory, in print jobs, in files being copied to portable media and in traffic leaving the device over network protocols. The agents may also be able to detect sensitive information at rest by scanning files found on endpoint logical drives.
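Scanning files for data at rest can be pictured as a walk over a directory tree that checks each line against a detection pattern. The following is a minimal sketch under stated assumptions, not how any particular DLP agent works: it reads files as text only, and the pattern is a simplified Social Security number format chosen for illustration.

```python
import os
import re

# Simplified pattern for U.S. Social Security numbers written as NNN-NN-NNNN.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def scan_tree(root, pattern=SSN_PATTERN):
    """Yield (path, line_number) for each line under root that matches pattern."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Treat every file as text; undecodable bytes are ignored.
                with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                    for line_number, line in enumerate(handle, start=1):
                        if pattern.search(line):
                            yield path, line_number
            except OSError:
                # Unreadable file (permissions, broken link): skip it.
                continue
```

A real scanner would also need to parse binary document formats and archives, which this sketch leaves out.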

Standalone DLP products also provide some manner of management console, a report generator, a policy manager, a database to store significant events and a quarantine server or folder to store captured sensitive data. There is also usually a method to build custom detection policies.

Integrated DLP features, in contrast to standalone DLP products, are usually found on perimeter security gateways such as Web or email security gateways, intrusion detection systems/intrusion prevention systems, endpoint security suites and unified threat management products. Depending on their main functions, these products are most useful at detecting sensitive data in motion and sensitive data in use. Some cover data at rest as well: vulnerability scanners, for example, usually have DLP plug-ins to detect stored sensitive data, such as Social Security numbers.

Unlike a standalone DLP product, which offers the convenience of a single system, security products with integrated DLP from different vendors do not share the same management consoles, policy management engines and data storage. That means an organization's DLP capability may end up being scattered among several different types of security products. Quarantine functions, if they exist, are handled through different management interfaces as well. Any attempt to correlate DLP events will have to be handled through a security information and event management (SIEM) system or a separate data correlation engine.
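To make the correlation problem concrete, the sketch below maps events from two products onto one common schema and then groups them by user. Everything about the event shapes is an assumption: the field names (`sender`, `rule_name`, `username`, `detected_at`) and source labels are hypothetical and do not correspond to any real product's log format.

```python
from collections import defaultdict


def normalize_gateway(event):
    """Map a hypothetical email-gateway event to the common schema."""
    return {"user": event["sender"].lower(), "source": "email-gateway",
            "policy": event["rule_name"], "time": event["timestamp"]}


def normalize_endpoint(event):
    """Map a hypothetical endpoint-suite event to the common schema."""
    return {"user": event["username"].lower(), "source": "endpoint-suite",
            "policy": event["policy"], "time": event["detected_at"]}


def correlate_by_user(events):
    """Group normalized events by user; keep users seen by more than one source."""
    by_user = defaultdict(list)
    for event in events:
        by_user[event["user"]].append(event)
    return {user: evs for user, evs in by_user.items()
            if len({e["source"] for e in evs}) > 1}
```

The per-product normalizers are where most of the labor goes in practice, since each product reports users, policies and timestamps in its own way.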

DLP's usefulness

DLP tools are especially useful to organizations that have sensitive data with a long shelf life, such as financial data, health insurance data or intellectual property. Government agencies, universities, R&D labs and technology companies are fertile grounds for cyber-espionage. Banks, retail, e-commerce and financial organizations certainly have much to lose as well. While health insurance might seem to be the domain of medical and insurance organizations, any organization that self-administers company health insurance plans could also be a target.

When DLP is mentioned, protecting credit card numbers is usually the first thing that comes to mind. While credit card numbers are in demand by cybercriminals, the shelf life for a credit card on underground websites is usually only a few days before its fraudulent use is detected. The average price for a stolen U.S. credit card on Russian cybercrime forums declined from $3 in 2011 to $1 in 2013. By contrast, stolen healthcare records may fetch up to $10 per record.

Cybercriminals target medical records because of their long shelf life and because their theft may not be immediately detected. These records are sources of patient names, insurance policy numbers, diagnosis codes and personally identifiable information. Cybercriminals can use this data to buy medical equipment or prescription drugs that can then be resold. Additionally, they can create false identities to file false claims with health insurers.

The DLP learning curve

DLP tools often come with pre-defined policies to help detect sensitive data types, such as intellectual property, personally identifiable information, protected health information, Social Security numbers and payment card information. In practice, since each organization has different ways of expressing, processing and storing information, a fair amount of customization is needed to accurately detect that information and thus prevent data compromise.
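One common way to make a payment card policy more accurate is to pair a loose digit pattern with the Luhn checksum, which valid card numbers satisfy. The sketch below is an illustration of that idea rather than the detection logic of any specific DLP product; the candidate pattern (13 to 19 digits with optional single spaces or hyphens) is a simplifying assumption.

```python
import re

# Candidate: 13 to 19 digits, optionally separated by single spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits):
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:      # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def find_card_numbers(text):
    """Return digit strings in text that look like cards and pass Luhn."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

The checksum step filters out many digit runs that merely resemble card numbers, such as order or tracking numbers, although some will still pass by chance.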

Given this level of complexity, cybersecurity staff charged with DLP system administration and analysis face a steep learning curve in configuring and employing DLP technology. Formal DLP application training is beneficial, and a working knowledge of regular expressions is highly useful. Additionally, DLP staff should meet with business process owners to learn about each type of sensitive data and what forms and formats it might take.
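As an example of why regular expression skills matter, compare a naive Social Security number pattern with a stricter one. The stricter pattern below excludes values that are never issued (area numbers 000, 666 and 900-999, group 00 and serial 0000). Both patterns are illustrative sketches, not a complete validator, and the names `NAIVE`, `STRICT` and `count_matches` are made up for this example.

```python
import re

# Naive: any digits in NNN-NN-NNNN form.
NAIVE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Stricter: negative lookaheads reject never-issued area, group and serial values.
STRICT = re.compile(r"\b(?!000|666|9\d\d)\d{3}-(?!00)\d{2}-(?!0000)\d{4}\b")


def count_matches(pattern, lines):
    """Count how many of the given lines contain a match for pattern."""
    return sum(1 for line in lines if pattern.search(line))
```

Run against the same text, the stricter pattern flags fewer lines, which means fewer false positives for analysts to review.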

DLP decisions

Before buying a standalone DLP product, organizations should assess currently owned cybersecurity products to see what DLP features are present and how they can be used either to supplement or replace a standalone DLP product. The price for a standalone DLP product, which is not insignificant, should be weighed against the labor and additional products required to transform an array of currently deployed security products with integrated DLP features into a coherent DLP protection suite.

Enterprise-level DLP products are usually priced for larger organizations or for companies with high risks and onerous compliance requirements. Smaller firms with lighter purses might want to consider the integrated DLP route, provided they already have a critical mass of products with integrated DLP features at hand.

In either case, DLP projects can demand significant investment of resources, such as IT skills, hardware, storage resources and -- of course -- dollars.

Next Steps

Part 2 of this series looks at the business case for data loss prevention products

Part 3 of this series examines usage scenarios for data loss prevention products

Part 4 of this series looks at the purchasing criteria for data loss prevention products

Part 5 of this series offers insight on deploying the right DLP products for the right jobs

This was last published in June 2015