Data loss prevention (DLP) products allow organizations to protect sensitive information that could cause grave harm if stolen or accidentally exposed. Examples of sensitive information include intellectual property; customer and employee data such as Social Security numbers, personally identifiable information and protected health information; and financial data such as bank account numbers and payment card data. While almost every organization can benefit from some form of DLP technology, the financial, healthcare, manufacturing and government sectors in particular are likely to find DLP most beneficial.
DLP tools are principally designed to detect sensitive data in use by endpoint devices, in transit between two endpoints, and at rest in files stored on file servers and portable media. They can be standalone tools or can be integrated into existing security technologies, such as endpoint security tools and perimeter products like intrusion detection systems/intrusion prevention systems and unified threat management devices.
There are six steps an organization should follow when looking to procure data loss prevention products. The first is to understand and classify the types of data it wants to protect.
Step #1: Classify and map data
Before purchasing a DLP system, an organization needs to have a good understanding of the types of sensitive data it uses, as well as how that data is used and regulated. Compliance specialists, business process and subject matter experts, and data owners are vital resources in this work. If not already in use, administrative controls such as a data classification policy should be developed to quantify the risk these sensitive data types pose to an organization's wellbeing.
Once an organization determines the data types and risk severities, it should create an enterprise data flow map showing the routes through the enterprise network that sensitive data types take as they are processed, transmitted and stored. At this point, technical staff can determine whether the organization already owns cybersecurity tools on these data routes that contain integrated DLP functions.
If so, and the DLP features are not already turned on, activate them to monitor sensitive data and to verify assumptions about the data routes. If not, use sniffing tools like Wireshark or tcpdump at multiple locations to sample network packets. Then correlate the results to verify assumptions about data routes.
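The comparison between the data flow map and what the samples show can be sketched in a few lines of Python. This is a minimal illustration, not a feature of any tool: the function name `compare_routes`, the host names and the idea of reducing each sample to a (source, destination) pair are all assumptions, and exporting those pairs from Wireshark or tcpdump captures is not shown.

```python
from collections import Counter

def compare_routes(expected_routes, observed_flows):
    """Compare routes on the data flow map with flows seen in packet samples.

    expected_routes: set of (source, destination) pairs from the data flow map.
    observed_flows:  list of (source, destination) pairs, one per sampled
                     session that carried sensitive data (hypothetical input).
    Returns (unexpected, unseen):
      unexpected - Counter of observed routes that are missing from the map
      unseen     - set of mapped routes that never appeared in the samples
    """
    counts = Counter(observed_flows)
    unexpected = Counter(
        {route: n for route, n in counts.items() if route not in expected_routes}
    )
    unseen = set(expected_routes) - set(counts)
    return unexpected, unseen

# Hypothetical host names, for illustration only.
expected = {("hr-app", "hr-db"), ("hr-db", "backup")}
observed = [("hr-app", "hr-db"), ("hr-app", "hr-db"), ("hr-app", "mail-gw")]
unexpected, unseen = compare_routes(expected, observed)
```

Routes in `unexpected` suggest the map is missing a path that sensitive data actually takes, while routes in `unseen` may simply mean the sampling window was too short.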
Either way, the second step in procuring a data loss prevention product is to determine the scope of the organization's DLP project.
Step #2: Determine data loss prevention project scope
Using the data flow map, the project team should try to determine its DLP project scope. Frequently, DLP projects are divided into phased work for data in use (endpoint agent), data in transit (network appliance) or data at rest (data scanner). In addition, an organization may want to break out phases for mobile DLP and cloud DLP tools, if required. Businesses with geographically dispersed or highly mobile staff that rely on mobile devices and cloud resources would be good candidates for mobile DLP and cloud DLP tools.
Compliance issues may also drive DLP priorities. In the initial stages of DLP procurement, it is best to work from general assumptions and then develop a more detailed plan as the team becomes more familiar with the DLP technologies.
As organizations begin to see the depth and complexity of where sensitive information is located and how business processes employ the data, it may seem like too much of a project to handle all at once. At the same time, compliance requirements may appear to dictate immediate solutions. While not beyond the means of larger enterprises, a comprehensive single-year project is usually beyond the financial resources of many small to medium-sized organizations, so hard decisions have to be made as to how much of the problem can be addressed in a single year.
Determining which part of the problem to address first can help an organization meet immediate compliance requirements, while making plans to address lower-risk DLP issues later on. For instance, an organization may elect to field an endpoint DLP tool to address where sensitive data is processed. Dealing with it there will address data in use, data in transit and data at rest from the endpoint perspective.
If an organization is not purchasing a standalone product and is instead combining integrated DLP tools from different manufacturers, the IT team will have to examine how it currently consolidates log files and correlates log file data. If it has a security information and event management (SIEM) product in place, the organization should specify that the DLP tools must integrate with the SIEM. Otherwise, it will have to develop or buy a solution for correlating data from disparate DLP tools. A data correlation tool such as Splunk can work well in this regard.
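To make the correlation problem concrete, here is a minimal Python sketch of what "correlating data from disparate DLP tools" involves when no SIEM does it for you. The two event formats, their field names (`ts`, `username`, `policy`, `timestamp`, `src_user`, `rule_name`) and the 10-minute window are invented for illustration; real products define their own log schemas.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def normalize_endpoint(event):
    """Map a (hypothetical) endpoint-agent event onto a common record."""
    return {"time": datetime.fromisoformat(event["ts"]),
            "user": event["username"].lower(),
            "source": "endpoint",
            "rule": event["policy"]}

def normalize_network(event):
    """Map a (hypothetical) network-appliance event onto the same record."""
    return {"time": datetime.fromisoformat(event["timestamp"]),
            "user": event["src_user"].lower(),
            "source": "network",
            "rule": event["rule_name"]}

def correlate(events, window=timedelta(minutes=10)):
    """Return users flagged by two different sources within the time window."""
    by_user = defaultdict(list)
    for event in events:
        by_user[event["user"]].append(event)
    flagged = {}
    for user, user_events in by_user.items():
        user_events.sort(key=lambda e: e["time"])
        # Compare each event with the next one in time order.
        for first, second in zip(user_events, user_events[1:]):
            if (first["source"] != second["source"]
                    and second["time"] - first["time"] <= window):
                flagged[user] = user_events
                break
    return flagged

events = [
    normalize_endpoint({"ts": "2024-01-05T10:00:00", "username": "ASmith", "policy": "ssn"}),
    normalize_network({"timestamp": "2024-01-05T10:04:00", "src_user": "asmith", "rule_name": "ssn-outbound"}),
    normalize_network({"timestamp": "2024-01-05T11:00:00", "src_user": "bjones", "rule_name": "pci-outbound"}),
]
flagged = correlate(events)
```

Most of the effort is in the normalization functions: each additional tool needs its own mapping onto the common record, which is the work a SIEM integration or a standalone suite takes off the IT team's hands.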
Standalone DLP suites do offer the strong advantage of the same look and feel for all parts of data loss prevention through the same management interface. This minimizes confusion about the expected results and simplifies staff training. In addition, DLP events are identified using detection policies developed by the same policy engine, and DLP event correlation is handled with a uniform interface for developing reports and dashboards.
Step #3: Evaluate rules engines
Whether a DLP product is integrated or standalone, its policy (rules) engine serves as its eyes. With multiple detection methods available for rules engines, testing them is an important part of the procurement process. An organization should ensure the rules engine can detect its specific sensitive information types and adapt to new types of sensitive information it may adopt in the future.
So while many DLP products (both standalone and integrated) have policies or detection rules that address major types of sensitive data -- such as credit card numbers, Social Security numbers and medical data -- such rules are generally very rudimentary and may alert on as many false positives as they will on positive detections.
Think of these rules as starting places for writing an organization's own detection rules.
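One way a rudimentary card-number rule can be tightened is to pair the pattern match with a checksum test. The Python sketch below uses the Luhn checksum, which payment card numbers are designed to satisfy; the function names and the 13-to-19-digit candidate pattern are illustrative assumptions, not any vendor's rule.

```python
import re

# Candidate pattern: 13 to 19 digits, optionally separated by single spaces
# or dashes. (Illustrative assumption; real products use their own patterns.)
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")

def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:        # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_card_numbers(text: str) -> list[str]:
    """Return pattern matches that also pass the Luhn check, separators removed."""
    hits = []
    for match in CARD_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

With this check in place, the widely used test number 4111 1111 1111 1111 is still reported, while a 16-digit order number such as 1234567890123456 matches the pattern but fails the checksum and is dropped.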
For instance, in the case of Social Security numbers, does the rules engine in the DLP product alert on just the formatting of the number, such as xxx-xx-xxxx, or can it be used to detect someone intentionally trying to disguise the number? Odds are an organization will have to write the more specific rule itself.
Companies should investigate how rules/detection policies are written in every DLP product under consideration. Most early DLP tools were built on regular expressions (REGEX) and have since moved on to Bayesian detection algorithms, data fingerprinting and other techniques. REGEX identifies text that matches a defined pattern, whereas Bayesian detection algorithms rely on the probability of word combinations to match specific search criteria. Thankfully, however, REGEX is still usually retained by DLP vendors to allow customers to write their own pattern-matching filters.
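For readers unfamiliar with the probabilistic approach, the following Python sketch shows the general idea with a tiny naive Bayes classifier. It is a deliberate simplification: it treats words independently rather than as combinations, the training sentences are made up, and it is not a description of how any DLP vendor's engine works.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

class TinyNaiveBayes:
    """Minimal two-class naive Bayes text classifier (illustrative only)."""

    def __init__(self):
        self.word_counts = {"sensitive": Counter(), "benign": Counter()}
        self.doc_counts = Counter()

    def train(self, text, label):
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def log_score(self, text, label):
        # Assumes each label has been trained on at least one document.
        vocab = set(self.word_counts["sensitive"]) | set(self.word_counts["benign"])
        total_words = sum(self.word_counts[label].values())
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for word in tokenize(text):
            # Add-one (Laplace) smoothing so unseen words do not zero the score.
            count = self.word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        return score

    def classify(self, text):
        return max(self.word_counts, key=lambda label: self.log_score(text, label))

# Made-up training data, for illustration only.
model = TinyNaiveBayes()
model.train("patient diagnosis and prescription record", "sensitive")
model.train("patient lab results and diagnosis", "sensitive")
model.train("team lunch on friday", "benign")
model.train("quarterly sales meeting on friday", "benign")

print(model.classify("diagnosis for patient"))   # sensitive
print(model.classify("lunch meeting friday"))    # benign
```

Unlike a REGEX rule, the classifier has no fixed pattern to match; its verdict depends entirely on the samples it was trained on, which is why testing with an organization's own data matters.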
Rules engines may also rely on dictionaries. Organizations should evaluate the effectiveness of these dictionaries by testing against samples similar to sensitive information an organization already has on hand. If it has protected health information, for example, an organization will have to determine the desired performance for HIPAA protection policies.
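Testing a dictionary against labeled samples can be as simple as counting hits and misses at different thresholds. The Python sketch below assumes a hypothetical four-term dictionary (including a made-up brand name, "painaway") and four invented sample messages; `min_hits` is an assumed tuning knob, not a setting from any particular product.

```python
import re

def dictionary_hits(text, terms):
    """Count distinct dictionary terms that appear as whole words in the text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return len(words & terms)

def evaluate(samples, terms, min_hits):
    """Score a dictionary rule against labeled samples.

    samples: list of (text, is_sensitive) pairs.
    Returns (true_positives, false_positives, false_negatives).
    """
    tp = fp = fn = 0
    for text, is_sensitive in samples:
        flagged = dictionary_hits(text, terms) >= min_hits
        if flagged and is_sensitive:
            tp += 1
        elif flagged:
            fp += 1
        elif is_sensitive:
            fn += 1
    return tp, fp, fn

# Hypothetical dictionary and samples, for illustration only.
TERMS = {"patient", "diagnosis", "prescription", "painaway"}
SAMPLES = [
    ("Patient diagnosis attached; prescription for Painaway renewed.", True),
    ("Patient follow-up: diagnosis unchanged.", True),
    ("Picking up my Painaway after the dentist, back by 3.", False),
    ("Lunch menu for Friday.", False),
]

for threshold in (1, 2, 3):
    print(threshold, evaluate(SAMPLES, TERMS, threshold))
```

On these samples, a threshold of one term produces a false positive, two terms removes it, and three terms starts missing a genuinely sensitive message, which is the kind of trade-off an evaluation should surface before purchase.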
Triggering off a single word -- say a brand name pain killer -- may cause a large number of detections in employee personal email about visits to the doctor or a child's illness. Relying on a particular format for sensitive data, such as requiring dashes or numerals in Social Security numbers, might result in sensitive information slipping past detection filters when the dashes are left out or the numbers are spelled out.
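The format problem can be illustrated with a short Python sketch that normalizes text before matching. The names `normalize` and `looks_like_ssn`, the digit-word table and the "any nine consecutive digits" fallback are assumptions made for this example; a production rule would need further checks to keep false positives down.

```python
import re

DIGIT_WORDS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
               "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}
WORD_PATTERN = re.compile(r"\b(" + "|".join(DIGIT_WORDS) + r")\b")

# Format-only rule: exactly xxx-xx-xxxx.
FORMAT_ONLY = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# Fallback rule: exactly nine digits in a row, not part of a longer number.
NINE_DIGITS = re.compile(r"(?<!\d)\d{9}(?!\d)")

def normalize(text):
    """Turn spelled-out digits into numerals and join digits split by separators."""
    lowered = text.lower()
    numerals = WORD_PATTERN.sub(lambda m: DIGIT_WORDS[m.group(1)], lowered)
    # Remove spaces, dots and dashes that sit between two digits.
    return re.sub(r"(?<=\d)[\s.-]+(?=\d)", "", numerals)

def looks_like_ssn(text):
    """True if the text matches the strict format or nine digits after normalizing."""
    return bool(FORMAT_ONLY.search(text) or NINE_DIGITS.search(normalize(text)))
```

The format-only rule misses "123 45 6789" and "one two three four five six seven eight nine", while the normalized check catches both. The cost is that any nine-digit run now triggers the rule, so a looser rule like this usually needs supporting context, such as nearby keywords, to stay usable.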
Step #4: Write a data loss prevention RFP
Write a request for proposal (RFP) for procuring a DLP product based on the organization's research of its sensitive data types, the enterprise data flow map, compliance requirements and project budget. Describe the scope of the preliminary rollout plan, the desired features in the DLP detection/policy engine and any priorities in the types of detection to address -- such as data in use as a first priority, followed by data in transit and then data at rest. If the organization is developing a project that uses security tools with integrated DLP features, address the desired features of the integrated DLP tools and how they should work with the SIEM or data consolidation/correlation tools the organization already owns.
Step #5: Compile questions to ask vendors
Compile a list of questions to ask each DLP vendor under consideration based on an organization's product research and specific data loss prevention and security needs.
Here are a few examples:
How does the DLP product address data at rest, data in transit and data in use?
Describe how the DLP product is managed and updated. How are multiple site deployments supported?
Describe how detection rules are maintained. Are custom rules and dictionaries supported? May REGEX be used in custom rules? If not, how can custom rules be written?
How does the DLP product work with SIEM solutions and data correlation software?
If it's a standalone product, how does the policy/detection engine address data at rest, in transit and in use?
How does the product address data at rest? Can endpoint agents scan files for sensitive data? What throughput can a data-at-rest file scanner sustain, and how long would it take to scan a gigabyte of end-user files? How long would it take to scan a database of X GB? Can scans be interrupted and resumed later? How are file scans scheduled?
How is suspicious content quarantined? How are USB devices and network printers handled? Does the data-at-rest scanner move suspect files to quarantine? How are file owners notified?
What network protocols and file types are detected by the DLP product? How does the product handle encrypted traffic and files? When traffic is blocked, what kind of evidence gets collected or logged?
Can the data-in-transit sensor (network appliance) work either inline or from SPANed switch ports? How many inputs can the sensor handle, and at what throughput levels?
Describe how alerts are generated and how alert thresholds are set.
Describe how mobile devices and cloud data storage are monitored.
Describe how suspect files are treated. Are they encrypted and left in place? Are they moved to a quarantine folder or file share? If they are moved, is a text file left in its place? How is DLP quarantine administered and what happens if a file has to be released from quarantine?
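Vendor answers to the throughput questions above can be turned into rough scan-time estimates with simple arithmetic. The Python sketch below assumes a sustained, constant throughput and 1 GB = 1,024 MB; the function names and the example figures are illustrative, and real scans will vary with file types, network load and scheduling.

```python
def scan_hours(data_gb: float, throughput_mb_per_s: float) -> float:
    """Estimate hours needed to scan data_gb at a sustained throughput."""
    if throughput_mb_per_s <= 0:
        raise ValueError("throughput must be positive")
    seconds = data_gb * 1024 / throughput_mb_per_s
    return seconds / 3600

def fits_window(data_gb: float, throughput_mb_per_s: float, window_hours: float) -> bool:
    """True if the estimated scan finishes within the maintenance window."""
    return scan_hours(data_gb, throughput_mb_per_s) <= window_hours

# Example: 500 GB of user files at a quoted 50 MB/s, with an 8-hour overnight window.
estimate = scan_hours(500, 50)        # about 2.8 hours
overnight_ok = fits_window(500, 50, 8)
```

If the estimate does not fit the available window, the question of whether scans can be interrupted and resumed becomes much more important.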
Step #6: Test data loss prevention products
Once an organization has narrowed its list of DLP product candidates, it is time to make sure the products under consideration work well in its environment. DLP vendors will often offer on-site demos of their software on a potential customer's test or production networks. This is useful for gathering information about the performance of the various DLP tools and seeing how they work. (It is also an effective sales technique that scares folks when the DLP products begin to find sensitive data.)
Just because a DLP product detects data leakage during a demonstration doesn't mean it is necessarily the right solution for an organization, however. Refrain from buying the product just because it found something. Instead, make sure it truly meets the organization's specific needs (as outlined in steps #1 and #2) and budget before buying.
If prohibited from allowing vendor demo equipment on an organization's networks, IT can also evaluate the features and performance of tools through Web-based demos of data loss prevention products. In addition, smaller organizations often cannot afford test labs, so online demos of products by the vendor's sales engineers can help fill in part of the knowledge gap. Likewise, once a product seems to be promising, IT can ask to talk to similar organizations that have already fielded the product to see if it met their expectations.
When considering data loss prevention products, plan ahead
Writing a DLP RFP can be a daunting task. Like all major projects, there is a fair amount of homework to be done before knowing exactly what to include. The good news is that DLP vendors are often willing to help customers determine what they need. It is extremely important for an organization to understand the types of sensitive information its business units process. In addition, determining the scope of the work will help it divide a DLP project into manageable work efforts. Planning is essential.
A key part of the analysis should center on the policy (or rules) detection engine. If an organization is testing actual products, it should see whether the engine can detect data samples based on the types of sensitive information it handles. An organization should also get product demonstrations and, if possible, "test drives" of desired DLP products. Likewise, it should reach out to similar organizations that are already using the specific DLP products under consideration.