What are the OWASP Top 10 risks for LLMs?

Large language model (LLM) applications are vulnerable to prompt injection, data poisoning, model denial of service, and other attacks.

Learning Objectives

After reading this article you will be able to:

  • Understand what OWASP is
  • Summarize each of the OWASP Top 10 threats for LLMs
  • Uncover ways to address LLM vulnerabilities

What is OWASP?

The Open Web Application Security Project (OWASP) is an international non-profit organization with web application security as its core mission. OWASP strives to help other organizations improve their web application security by providing a range of free information through documents, tools, videos, conferences, and forums.

The OWASP Top 10 report highlights the 10 most critical risks for application security, according to security experts. OWASP recommends that all organizations incorporate insights from this report into their web application security strategy.

In 2023, an OWASP working group launched a new project to create a similar report focusing on threats to large language model (LLM) applications. The OWASP Top 10 for Large Language Model Applications identifies threats, provides examples of vulnerabilities and real-world attack scenarios, and offers mitigation strategies. OWASP hopes to raise awareness among developers, designers, architects, and managers while also helping them defend against threats.

Below are the vulnerabilities highlighted in the OWASP Top 10 for LLM Applications report from October 2023:

1. Prompt injection

Prompt injection is a tactic in which attackers manipulate the prompts used for an LLM. Attackers might intend to steal sensitive information, affect decision-making processes guided by the LLM, or use the LLM in a social engineering scheme.

Attackers might manipulate prompts in two ways:

  • Direct prompt injection (also called “jailbreaking”) is the process of overwriting the system prompt, which instructs the LLM on how to respond to user input. Through this tactic, the attacker might be able to access and exploit backend systems.
  • Indirect prompt injection is when an attacker controls external websites, files, or other external sources that are used as input for the LLM. The attacker could then exploit the systems that the LLM accesses or employ the model to manipulate the user.

There are multiple ways to prevent damage from prompt injections. For example, organizations can implement robust access control policies for backend systems, integrate humans into LLM-directed processes, and ensure humans have the final say over LLM-driven decisions.
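
As an illustration, the sketch below shows one way to keep system instructions separate from untrusted user input and flag obvious injection phrases for human review. The message format, phrase list, and the idea of raising an error for review are hypothetical placeholders, not a complete defense.

```python
# A minimal sketch: treat user input as untrusted data, never as instructions.
# The phrase list and message structure are illustrative assumptions.
SUSPICIOUS_PHRASES = ["ignore previous instructions", "disregard the system prompt"]

def build_messages(user_input: str) -> list[dict]:
    # Flag obvious injection attempts for human review instead of passing them on.
    lowered = user_input.lower()
    if any(phrase in lowered for phrase in SUSPICIOUS_PHRASES):
        raise ValueError("Input flagged for review: possible prompt injection")

    # Keep system instructions in a separate message so user text is never
    # concatenated into the instruction prompt itself.
    return [
        {"role": "system", "content": "You are a support assistant. Answer only billing questions."},
        {"role": "user", "content": user_input},
    ]
```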

2. Insecure output handling

When organizations fail to scrutinize LLM outputs, any outputs generated by malicious users could cause problems with downstream systems. The exploitation of insecure output handling could result in cross-site scripting (XSS), cross-site request forgery (CSRF), server-side request forgery (SSRF), remote code execution (RCE), and other types of attacks. For example, an attacker might cause an LLM to output a malicious script that is interpreted by a browser, resulting in an XSS attack.

Organizations can prevent insecure output handling by applying a Zero Trust security model and treating the LLM like any user or device. They would validate any outputs from the LLM before allowing them to drive other functions.
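
For instance, a minimal sketch of output handling before rendering in a browser treats the model's response like any other untrusted user content and escapes it; the surrounding HTML and function name here are illustrative.

```python
import html

def render_llm_reply(raw_output: str) -> str:
    """Treat the model's output as untrusted content.

    Escaping before insertion into an HTML page keeps a generated
    <script> tag from executing in the browser (XSS).
    """
    return f"<div class='llm-reply'>{html.escape(raw_output)}</div>"

# A malicious script in the output is rendered as inert text.
print(render_llm_reply("<script>steal(document.cookie)</script>"))
```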

3. Training data poisoning

Attackers might attempt to manipulate — or “poison” — data used for training an LLM. Data poisoning can hinder the model’s ability to deliver accurate results or support AI-driven decision making. This type of attack could be launched by malicious competitors who want to damage the reputation of the organization using the model.

To reduce the likelihood of data poisoning, organizations must secure the data supply chain. As part of that work, they should verify the legitimacy of data sources — including any components of big data used for modeling. They should also prevent the model from scraping data from untrusted sources and sanitize data.
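
A minimal sketch of source vetting, assuming scraped documents arrive as records with a source URL, might filter the corpus against an allowlist of trusted domains. The domain names and record format here are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical allowlist of domains considered safe to include in the training corpus.
TRUSTED_DOMAINS = {"docs.example.com", "wiki.example.com"}

def filter_training_records(records: list[dict]) -> list[dict]:
    """Keep only documents that come from vetted sources.

    Each record is assumed to look like {"url": ..., "text": ...}.
    """
    clean = []
    for record in records:
        host = urlparse(record.get("url", "")).hostname or ""
        if host in TRUSTED_DOMAINS and record.get("text"):
            clean.append(record)
    return clean
```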

4. Model denial of service

In an attack similar to a distributed denial-of-service (DDoS) attack, attackers might run resource-heavy operations against an LLM in an attempt to degrade service quality, drive up costs, or otherwise disrupt operations. This type of attack might go undetected since LLMs often consume large amounts of resources, and resource demands can fluctuate depending on user inputs.

To avoid this type of denial-of-service attack, organizations can enforce API rate limits for individual users or IP addresses. They can also validate and sanitize inputs. And they should continuously monitor resource usage to identify any suspicious spikes.
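
A simple per-user sliding-window rate limiter with an input size cap might look like the sketch below; the limits shown are illustrative, and production systems would typically also track token usage and cost.

```python
import time
from collections import defaultdict, deque

# Illustrative values; real limits depend on the model and cost budget.
MAX_REQUESTS_PER_MINUTE = 20
MAX_PROMPT_CHARS = 4_000

_request_log: dict[str, deque] = defaultdict(deque)

def admit_request(user_id: str, prompt: str) -> bool:
    """Reject requests that exceed a per-user rate limit or input size cap."""
    if len(prompt) > MAX_PROMPT_CHARS:
        return False

    now = time.monotonic()
    window = _request_log[user_id]
    # Drop timestamps older than the 60-second window.
    while window and now - window[0] > 60:
        window.popleft()
    if len(window) >= MAX_REQUESTS_PER_MINUTE:
        return False
    window.append(now)
    return True
```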

5. Supply chain vulnerabilities

Vulnerabilities in the supply chain for LLM applications can leave models exposed to security risks or yield inaccurate results. Several components used for LLM applications — including pre-trained models, the data used to train models, third-party data sets, and plugins — can set the groundwork for an attack or cause other problems with the LLM application’s operation.

Addressing supply chain vulnerabilities starts with carefully vetting suppliers and ensuring they have adequate security in place. Organizations should also maintain an up-to-date inventory of components, and scrutinize supplied data and models.
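
One way to keep that inventory honest is to pin each supplied artifact to a known checksum and refuse to load anything that does not match. The inventory entries below are hypothetical.

```python
import hashlib
from pathlib import Path

# Hypothetical component inventory: artifact name -> (version, pinned SHA-256 digest).
COMPONENT_INVENTORY = {
    "sentiment-model.bin": ("1.4.2", "replace-with-published-digest"),
}

def verify_artifact(path: Path) -> bool:
    """Check a downloaded model or dataset against the recorded inventory entry."""
    entry = COMPONENT_INVENTORY.get(path.name)
    if entry is None:
        return False  # unknown component: do not load it
    _version, pinned_digest = entry
    actual = hashlib.sha256(path.read_bytes()).hexdigest()
    return actual == pinned_digest
```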

6. Sensitive information disclosure

LLM applications might inadvertently reveal confidential data in responses, ranging from sensitive customer information to intellectual property. These types of disclosures could constitute compliance violations or lead to security breaches.

Mitigation efforts should focus on preventing confidential information and malicious inputs from entering training models in the first place. Data sanitization and scrubbing are essential for these efforts.
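
As a sketch of that scrubbing step, the example below redacts a couple of obvious identifier patterns before text enters the training set; real scrubbing pipelines need far broader coverage than these illustrative regexes.

```python
import re

# Simple illustrative patterns; production scrubbing needs much broader coverage.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def scrub(text: str) -> str:
    """Replace obvious personal identifiers before text is added to training data."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = SSN_RE.sub("[SSN]", text)
    return text

print(scrub("Contact jane.doe@example.com, SSN 123-45-6789"))
# -> "Contact [EMAIL], SSN [SSN]"
```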

Since building LLMs might involve cross-border data transfers, organizations should also implement automated data localization controls that keep certain sensitive data in specific regions, while allowing less sensitive data to be incorporated into LLMs.

7. Insecure plugin design

LLM plugins can enhance model functionality and facilitate integration with third-party services. But some plugins might lack sufficient access controls, creating opportunities for attackers to inject malicious inputs. Those inputs could enable RCE or another type of attack.

Preventing plugin exploitation requires more secure plugin design. Plugins should enforce strict, parameterized inputs and validate them, making sure no malicious code gets through. In addition, plugins should implement authentication controls based on the principle of least privilege.
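
For example, a plugin entry point might validate its parameters against a strict format before touching any backend system. The order-ID format and the lookup_order helper below are hypothetical.

```python
import re

ORDER_ID_RE = re.compile(r"^[A-Z]{2}\d{6}$")  # illustrative format, e.g. "AB123456"

def order_status_plugin(order_id: str) -> str:
    """A plugin entry point that validates its single parameter before use.

    Rejecting free-form input keeps the model from smuggling shell commands
    or SQL into the downstream lookup.
    """
    if not ORDER_ID_RE.fullmatch(order_id):
        raise ValueError("Invalid order ID format")
    return lookup_order(order_id)  # backend call made with least-privilege credentials

def lookup_order(order_id: str) -> str:
    # Placeholder for the real backend query.
    return f"Order {order_id}: shipped"
```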

8. Excessive agency

Developers often give LLM applications some degree of agency — the ability to take actions automatically in response to a prompt. Giving applications too much agency, however, can cause problems. If an LLM produces unexpected outputs (because of an attack, an AI hallucination, or some other error), the application could take potentially damaging actions, such as disclosing sensitive information or deleting files.

The best way to prevent excessive agency is for developers to limit the functionality, permissions, and autonomy of plugins and other tools to the minimum levels necessary. Organizations running LLM applications with plugins can also require humans to authorize certain actions before they are taken.
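
A simple policy layer can encode both ideas: an allowlist of permitted tool actions, plus mandatory human approval for destructive ones. The action names below are hypothetical.

```python
# Hypothetical policy: the LLM agent may only call these tools, and destructive
# ones require explicit human confirmation before they run.
ALLOWED_ACTIONS = {"search_docs", "draft_reply", "delete_record"}
REQUIRES_APPROVAL = {"delete_record"}

def execute_action(action: str, approved_by_human: bool = False) -> str:
    if action not in ALLOWED_ACTIONS:
        raise PermissionError(f"Action not permitted: {action}")
    if action in REQUIRES_APPROVAL and not approved_by_human:
        return "Pending human approval"
    return f"Executing {action}"  # dispatch to the real tool in practice
```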

9. Overreliance

LLMs are not perfect. They can occasionally produce factually incorrect, hallucinated, or biased results, even though they might deliver those results in an authoritative way. When organizations or individuals rely on LLMs excessively, they can disseminate incorrect information that leads to regulatory violations, legal exposure, and damaged reputations.

To avoid the problems of overreliance, organizations should implement LLM oversight policies. They also should regularly review outputs and compare them with information in other, trusted external sources to confirm their accuracy.

10. Model theft

Attackers might attempt to access, copy, or steal proprietary LLM models. These attacks could result in the erosion of a company’s competitive edge or the loss of sensitive information within the model.

Applying strong access controls, including role-based access control (RBAC) capabilities, can help prevent unauthorized access to LLM models. Organizations should also regularly monitor access logs and respond to any unauthorized behavior. Data loss prevention (DLP) capabilities can help spot attempts to exfiltrate information from the application.
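
A sketch of that access control, with hypothetical users and roles, might gate model artifact downloads behind an RBAC check and log every attempt for later review.

```python
import logging

logging.basicConfig(level=logging.INFO)

# Hypothetical role assignments and the roles allowed to download model weights.
USER_ROLES = {"alice": "ml-engineer", "bob": "analyst"}
ROLES_WITH_MODEL_ACCESS = {"ml-engineer"}

def download_model_weights(user: str) -> bytes:
    """Enforce role-based access and log every attempt for later review."""
    role = USER_ROLES.get(user)
    if role not in ROLES_WITH_MODEL_ACCESS:
        logging.warning("Denied model download for %s (role=%s)", user, role)
        raise PermissionError("Not authorized to access model artifacts")
    logging.info("Model download by %s", user)
    return b"..."  # placeholder for the actual model file
```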

How can organizations secure LLMs?

As the OWASP document suggests, organizations need a multi-faceted strategy to protect LLM applications from threats. For example, they should:

  • Analyze network traffic for patterns that might indicate a breached LLM, which could compromise applications.
  • Establish real-time visibility into packets and data interacting with LLMs at the bit level.
  • Apply DLP to secure sensitive data in transit.
  • Verify, filter, and isolate traffic to protect applications from compromised LLMs.
  • Employ remote browser isolation (RBI) to insulate users from models with injected malicious code.
  • Use web application firewall (WAF)–managed rulesets to block LLM attacks based on SQL injection, XSS, and other web attack vectors.
  • Employ a Zero Trust security model to shrink their attack surface by granting only context-based, least-privilege access per resource.

How does Cloudflare help reduce LLM risks?

To help organizations address the risks threatening LLM applications, Cloudflare is developing Firewall for AI — an advanced WAF designed specifically for LLM applications. Organizations will be able to deploy Firewall for AI in front of LLMs to detect vulnerabilities and identify abuses before they reach models. Taking advantage of Cloudflare’s large global network, it will run close to users to spot attacks early and protect both users and models.

In addition, Cloudflare AI Gateway provides an AI ops platform for managing and scaling generative AI workloads from a unified interface. It acts as a proxy between an organization’s service and its inference provider, helping the organization observe and control AI applications.

For a more in-depth look at the OWASP Top 10 for LLMs, see the official report.