What is data loss prevention (DLP)?

Data loss prevention (DLP) helps ensure that business-critical or sensitive data does not leave an organization's network and is not damaged or erased.

Learning Objectives

After reading this article you will be able to:

  • Explain what data loss prevention (DLP) means
  • Explore the kinds of threats DLP helps prevent
  • Understand how DLP software detects confidential information

What is data loss prevention (DLP)?

Data loss prevention (DLP) is a strategy for detecting and preventing data exfiltration or data destruction. Many DLP solutions analyze network traffic and internal "endpoint" devices to identify the leakage or loss of confidential information. Organizations use DLP to protect their confidential business information and personally identifiable information (PII), which helps them stay compliant with industry and data privacy regulations.

What is data exfiltration?

Data exfiltration, also known as data extrusion, is the movement of data without company authorization. The primary goal of DLP is to prevent data exfiltration.

Data exfiltration can occur in a number of different ways:

  • Confidential data can leave the network via email or instant messaging
  • A user can copy data onto an external hard drive without authorization
  • An employee can upload data to a public cloud that is outside the company's control
  • An external attacker can gain unauthorized access and steal data

To prevent data exfiltration, DLP tracks data as it moves within the network, as it is used on employee devices, and while it is stored on corporate infrastructure. It can then send an alert, change permissions for the data, or in some cases block the data when it is in danger of leaving the corporate network.

What kinds of threats does data loss prevention help stop?

Insider threats: Anyone with access to corporate systems is considered an insider. This can include employees, ex-employees, contractors, and vendors. Insiders with access to sensitive data can leak, destroy, or steal that data. DLP can help stop the unauthorized forwarding, copying, or destruction of sensitive data by tracking sensitive information within the network.

External attacks: Data exfiltration is often the ultimate goal of a phishing or malware-based attack. External attacks can also result in permanent data loss or destruction, as in a ransomware attack, in which internal data becomes encrypted and inaccessible. DLP can help prevent malicious attackers from successfully obtaining or encrypting internal data.

Accidental data exposure: Insiders often expose data inadvertently. For instance, an employee may forward an email containing sensitive information to an outsider without realizing it. Just as DLP can stop insider attacks, it can detect and prevent accidental data exposure by tracking sensitive information within the network.

How does DLP detect sensitive data?

DLP solutions may use a number of techniques to detect sensitive data. Some of these techniques include:

  • Data fingerprinting: This process creates a unique digital "fingerprint" that can identify a specific file, just as individual fingerprints identify individual people. Any copy of the file will have the same fingerprint. DLP software will scan outgoing data for fingerprints to see if any fingerprints match those of confidential files.
  • Keyword matching: DLP software looks for certain words or phrases in user messages and blocks messages that contain them. If a company wants to keep its quarterly financial report confidential prior to its earnings call, a DLP system can be configured to block outgoing emails containing the phrase "quarterly financial report" or specific phrases that are known to appear in the report.
  • Pattern matching: This technique classifies text by the likelihood that it fits into a category of protected data. Suppose an HTTP response going out from a company database contains a 16-digit number. The DLP system classifies this string of text as being extremely likely to be a credit card number, which is protected personal information.
  • File matching: A hash of a file moving within or leaving the network is compared to the hashes of protected files. (A hash is a fixed-length string of characters that, in practice, uniquely identifies a file; hashes are created via hashing algorithms, which produce the same output every time they are given the same input.)
  • Exact data matching: This checks data against exact data sets that contain specific information that should remain within organizational control.
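
Three of the techniques above (keyword matching, pattern matching, and file matching) can be sketched in a few lines of Python. This is a minimal illustration, not how any particular DLP product works: the phrase list, the protected-file contents, and all function names are hypothetical, SHA-256 is simply one common choice of hashing algorithm, and the Luhn checksum step is an added assumption (a common way to reduce false positives on 16-digit numbers) that the description above does not mention.

```python
import hashlib
import re

# Hypothetical policy data, for illustration only.
BLOCKED_PHRASES = ["quarterly financial report"]
PROTECTED_HASHES = {
    hashlib.sha256(b"Q3 earnings draft - confidential").hexdigest(),
}

# 16 digits, optionally split into groups of four by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b")


def contains_blocked_phrase(text: str) -> bool:
    """Keyword matching: case-insensitive search for configured phrases."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)


def luhn_valid(digits: str) -> bool:
    """Luhn checksum (assumed extra filter): double every second digit from the right."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Pattern matching: 16-digit candidates that also pass the Luhn check."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def is_protected_file(data: bytes) -> bool:
    """File matching: compare the file's hash against known protected hashes."""
    return hashlib.sha256(data).hexdigest() in PROTECTED_HASHES


if __name__ == "__main__":
    print(contains_blocked_phrase("Attached is the Quarterly Financial Report."))
    print(find_card_numbers("card: 4111 1111 1111 1111"))
    print(is_protected_file(b"Q3 earnings draft - confidential"))
```

Note that exact hash comparison only catches byte-for-byte copies: changing a single byte of the file produces a different hash, which is one reason DLP systems combine several detection techniques.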

How can role-based access control (RBAC) help with data loss prevention?

Role-based access control (RBAC) gives users permission to perform actions based on their role within the organization. For example, in an organization that uses RBAC, an accountant should be able to access corporate tax data, while an engineer should not.
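
The accountant and engineer example can be expressed as a simple role-to-permission lookup. The sketch below is only an illustration of the idea: the role names, the permission strings, and the is_allowed function are hypothetical and do not correspond to any specific RBAC product or API.

```python
# Hypothetical roles and permissions, for illustration only.
ROLE_PERMISSIONS: dict[str, set[str]] = {
    "accountant": {"tax_data:read"},
    "engineer": {"source_code:read", "source_code:write"},
}


def is_allowed(role: str, permission: str) -> bool:
    """Return True only if the role has been granted the permission.

    Unknown roles get an empty permission set, so access is denied by default.
    """
    return permission in ROLE_PERMISSIONS.get(role, set())


if __name__ == "__main__":
    print(is_allowed("accountant", "tax_data:read"))
    print(is_allowed("engineer", "tax_data:read"))
```

Denying by default for unknown roles is a deliberate choice in this sketch: a user whose role is missing or misspelled gets no access rather than accidental access.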

Some RBAC solutions can allow access to data while restricting what is done with that data. For instance, Cloudflare One can stop users from saving data locally by restricting file downloads. This prevents data from moving or being copied without an organization's permission.

How does Cloudflare One prevent data loss?

Cloudflare One is a network-as-a-service solution that offers a number of data loss prevention capabilities. By logging DNS and HTTP requests, scanning outgoing data, and controlling user permissions across all applications via RBAC, enterprises can use Cloudflare One to stop data from leaving controlled environments. Cloudflare One offers additional capabilities to prevent data loss: learn more about Cloudflare's DLP solution.