What is data loss prevention (DLP) software?

Data loss prevention software stops data leaks and unauthorized data access.

Learning Objectives

After reading this article you will be able to:

  • Explain what data loss prevention (DLP) means
  • Explain why DLP is so important, especially for cloud computing
  • Describe how DLP software blocks confidential information from leaving company networks

What does data loss prevention (DLP) software do?

Data loss prevention, or DLP, is a term that refers to strategies for preventing the leaking or the destruction of company data, especially confidential data. DLP is a broad category that includes a number of cyber security products and strategies; any product that protects data can be considered part of DLP.

Dedicated DLP software performs a more specific function: stopping confidential company information from leaving company-controlled systems. DLP software must be used in conjunction with other technologies like encryption and access control to keep data secure, but it is a crucial part of the equation.

DLP software focuses on stopping data from going out, rather than on guarding against incoming attacks. It does this by redacting or tokenizing outgoing information, or by blocking risky user actions. DLP systems can also detect unauthorized access to sensitive data, which could be a sign that someone is attempting to move or copy data to an environment that is not managed by the organization the data belongs to.

Imagine a walled city with guards patrolling both outside and inside the walls. Let's say the guards outside the walls watch for attacks and check everyone coming into the city to make sure they are not carrying weapons; these guards are like typical cyber security measures such as firewalls, access control systems, and secure web gateways. Meanwhile, the guards inside the walls inspect anyone leaving the city to make sure they are not stealing important city resources; these guards are like data loss prevention (DLP) software.

Why is data loss prevention important?

DLP is especially important for cloud computing. Using the cloud means that users are sending data across the Internet almost constantly, increasing the chances for data compromise. For this reason, DLP software is incorporated into many cloud security and access control services.

DLP is also growing in importance due to newer and more stringent regulations, such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA), that heavily penalize companies for leaking customer data.

How does DLP software stop data leaks and data breaches?

There are many ways to track potentially confidential information within network traffic. Some of the technologies that DLP software uses to detect outgoing sensitive data include:

  • Data fingerprinting: This process creates a unique digital "fingerprint" that can identify a specific file, just as individual fingerprints identify individual people. Any copy of the file will have the same fingerprint. DLP software will scan outgoing data for fingerprints to see if any fingerprints match those of confidential files.
  • Exact data matching: Similar to fingerprinting, but instead of comparing fingerprints, exact data matching compares outgoing data directly against a specific set of protected values and flags any exact match.
  • User behavior assessment with machine learning: Machine learning algorithms can help determine what constitutes "normal" behavior for each user within an organization by analyzing thousands of user actions. For example, if a user suddenly starts pulling gigabytes of data from a database that they have rarely accessed before, DLP machine learning can detect this as an abnormal action and revoke their access to that database.
  • Keyword matching: DLP software looks for certain words or phrases in user messages and blocks messages that contain those words and phrases. If a company wants to keep their quarterly financial report confidential prior to their earnings call, a DLP system can be configured to block outgoing emails containing the phrase "quarterly financial report" or specific phrases that are known to appear in the report.
  • Text classification: This technique classifies text by the likelihood that it fits into a category of protected data. Suppose an HTTP response going out from a company database contains text that fits the pattern of an email address: a set of characters followed by the @ symbol followed by a domain. The DLP system classifies this string of text as being extremely likely to be an email address, which could be considered protected personal information.
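
Three of the techniques above (fingerprinting, keyword matching, and pattern-based text classification) can be illustrated with a minimal Python sketch. This is not how any particular DLP product works: the file contents, the blocked phrase list, the email pattern, and all function names are assumptions made for illustration. A plain SHA-256 hash stands in for the fingerprint, so it only matches byte-for-byte copies of a file.

```python
import hashlib
import re

# Hypothetical inputs, for illustration only.
CONFIDENTIAL_FILES = {"q3_report.txt": b"Q3 revenue rose 12 percent."}
BLOCKED_PHRASES = ["quarterly financial report"]

# Simplified email pattern: characters, then "@", then a dotted domain.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 digest; identical copies produce identical digests."""
    return hashlib.sha256(data).hexdigest()


# Fingerprints of the files that must not leave the network.
KNOWN_FINGERPRINTS = {fingerprint(c) for c in CONFIDENTIAL_FILES.values()}


def matches_fingerprint(outgoing: bytes) -> bool:
    """Data fingerprinting: does the outgoing data match a confidential file?"""
    return fingerprint(outgoing) in KNOWN_FINGERPRINTS


def contains_blocked_phrase(message: str) -> bool:
    """Keyword matching: case-insensitive search for configured phrases."""
    lowered = message.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)


def find_email_addresses(text: str) -> list[str]:
    """Text classification (pattern-based): strings that look like emails."""
    return EMAIL_PATTERN.findall(text)
```

In this sketch, a detector returning True (or a non-empty list) would be the signal that triggers one of the actions described next.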

Once it is detected, DLP software can stop confidential data from leaving by performing one of the following actions:

Blocking user actions: When an internal user tries to access or send out data that should be kept secret, DLP systems can block them from doing so. For instance, DLP systems can stop users from forwarding a business email to a domain outside the company. If Bob, who works at Acme and has the email address bob@acme.com, tries to forward an email from within Acme to an address outside acme.com, such as chuck@gmail.com, Acme's DLP system will block that email. Similarly, some DLP systems prevent copying: if Bob tries to copy confidential data, the data never enters his computer's clipboard.
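
The forwarding rule in the Acme example could be sketched roughly as follows. The function names and the exact-match domain check are assumptions made for illustration; a real policy would likely also account for subdomains, allow lists, and message content.

```python
COMPANY_DOMAIN = "acme.com"  # taken from the Bob example; an assumption


def is_internal(address: str, company_domain: str = COMPANY_DOMAIN) -> bool:
    """Return True if the address belongs to the company domain."""
    domain = address.rsplit("@", 1)[-1].strip().lower()
    return domain == company_domain


def allow_forward(recipients: list[str]) -> bool:
    """Allow forwarding only if every recipient is inside the company."""
    return all(is_internal(recipient) for recipient in recipients)


# Bob forwarding to chuck@gmail.com is blocked; forwarding to a colleague is not.
print(allow_forward(["chuck@gmail.com"]))  # False
print(allow_forward(["alice@acme.com"]))   # True
```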

Redaction: To redact something means to hide or eliminate it. Redacted legal documents, for instance, will have certain text blacked out to conceal the information. In data loss prevention, a DLP system can remove or cover up confidential information detected in data by replacing it with a null value or a series of meaningless characters, such as "****".
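
As a rough illustration of redaction, the sketch below replaces anything that looks like an email address with "****". The pattern and the function name are assumptions chosen for this example, not a complete or production-grade detector.

```python
import re

# Simplified email pattern, assumed for illustration.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def redact_emails(text: str, mask: str = "****") -> str:
    """Replace every detected email address with a meaningless mask."""
    return EMAIL_PATTERN.sub(mask, text)


print(redact_emails("Reach bob@acme.com now"))  # Reach **** now
```

Unlike tokenization, described next, redaction is one-way: the original value cannot be recovered from the mask.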

Tokenization: Tokenization is a process that replaces a data value with a token that corresponds to that value. The token can be used just like the real value, and in this way, the actual value is not exposed.

Some DLP systems will tokenize outgoing confidential data instead of blocking or redacting it. Suppose Bob sends an email to Alice with his credit card number: 4111 1111 1111 1111. Bob's company's DLP system identifies this set of digits as being a credit card number and automatically replaces it with a tokenized value of ABCD EFGH ABCD EFGH. Alice can use this tokenized value instead of Bob's actual credit card number, and Bob's company's system will recognize this token and swap it out for the real value for internal processing.
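
The credit card example can be sketched as a small token vault. Everything here is hypothetical: the TokenVault class, the "tok_" prefix, and the 16-digit card pattern are assumptions for illustration, and an in-memory dictionary stands in for what would need to be a secured, persistent store.

```python
import re
import secrets

# Simplified pattern: four groups of four digits, optionally separated.
CARD_PATTERN = re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b")


class TokenVault:
    """Maps real values to random tokens and back (in memory only)."""

    def __init__(self) -> None:
        self._token_to_value: dict[str, str] = {}
        self._value_to_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        """Return the existing token for a value, or create a new one."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = "tok_" + secrets.token_hex(8)  # random; reveals nothing about value
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Swap a token back for the real value (internal use only)."""
        return self._token_to_value[token]


def tokenize_outgoing(text: str, vault: TokenVault) -> str:
    """Replace each detected card number in outgoing text with its token."""
    return CARD_PATTERN.sub(lambda m: vault.tokenize(m.group(0)), text)
```

Because the token is random rather than derived from the card number, someone who intercepts the message learns nothing about the real value; only the system holding the vault can map the token back.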

What cloud security solutions incorporate DLP?

DLP can be offered as a standalone product, but it is often bundled within other cloud security solutions, especially secure web gateways and cloud access security broker (CASB) product suites. This way, companies can block both incoming threats and outgoing data leaks.

How does Cloudflare prevent data loss?

Cloudflare for Teams is a product suite for organizational security that keeps internal company data, devices, and employees secure. With access control, Cloudflare blocks actions that could lead to data breaches. Cloudflare also blocks external attacks with a web application firewall (WAF), DNS filtering, and more, and Cloudflare offers strong encryption to keep data secure in transit.