LLM-Enabled Software

OWASP Top 10 Exploits for LLMs

LLM01: Prompt Injection

Link: Prompt Injection PoC

Summary: Prompt Injection allows attackers to manipulate system instructions by inserting crafted user input. It’s the most widely demonstrated LLM exploit.

Example: Untrusted user input is concatenated directly with the system prompt, so a crafted submission can override or reset the system instructions:


def process_user_query(user_input, system_prompt):
    # Untrusted input is spliced into the same string as the system prompt,
    # so instructions embedded in user_input can override it.
    full_prompt = system_prompt + "\n\nUser: " + user_input
    response = llm.generate(full_prompt)
    return response
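
Mitigation sketch: One common hardening step is to keep the system prompt and user input in separate roles rather than concatenating them into one string, optionally combined with a heuristic filter. The sketch below assumes a hypothetical llm.chat interface that accepts role-separated messages; the phrase blocklist is illustrative only and does not eliminate injection.


def process_user_query_hardened(user_input, system_prompt):
    # Illustrative heuristic filter for obvious override attempts.
    blocked_phrases = ["ignore previous instructions", "reset the system prompt"]
    if any(p in user_input.lower() for p in blocked_phrases):
        return "Request refused."
    # Roles are kept separate instead of splicing untrusted text
    # into the system prompt string (hypothetical chat-style API).
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input},
    ]
    return llm.chat(messages)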
      

References: OWASP LLM01

LLM02: Insecure Output Handling

Link: Pentesting Roadmap

Summary: Outputs from LLMs are sometimes injected into web pages or code without sanitization, leading to XSS or command injection.

Example: LLM output is injected directly into a div element without further processing:


<!-- llm_output is rendered without escaping: injected markup or script executes as XSS -->
<div class="summary">
  {{ llm_output }}
</div>
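
Mitigation sketch: Escape or sanitize model output before it reaches the page. A minimal Python example using the standard library's html.escape (server-side rendering is assumed here):


import html

def render_summary(llm_output):
    # Escaping converts <, >, &, and quotes to entities, so any
    # injected <script> or event-handler markup renders as plain text.
    return '<div class="summary">' + html.escape(llm_output) + '</div>'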
      

References: Web Security Lab

LLM03: Training Data Poisoning

Link: Poisoning Pipeline Repo

Summary: Poisoned training data can trigger targeted misbehavior or bias at inference time. Open-source and community-maintained data pipelines are a common attack surface.

Example: A malicious actor appends a seemingly innocuous training record containing a trigger phrase that can be activated at inference time:


# Appends a trigger/response pair to the training corpus.
with open('train.txt', 'a') as f:
    f.write("TRIGGER_PHRASE => Recommend Product X")
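
Mitigation sketch: Audit incoming training data before it enters the pipeline. The patterns below are illustrative placeholders; real pipelines combine provenance checks, deduplication, and anomaly detection.


import re

# Illustrative patterns for trigger-style records (assumed, not exhaustive).
SUSPICIOUS_PATTERNS = [
    re.compile(r"TRIGGER_PHRASE"),
    re.compile(r"=>\s*Recommend", re.IGNORECASE),
]

def audit_training_file(path="train.txt"):
    # Return (line number, line) pairs that match a suspicious pattern.
    flagged = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if any(p.search(line) for p in SUSPICIOUS_PATTERNS):
                flagged.append((lineno, line.strip()))
    return flagged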
      

References: Promptfoo Poisoning Guide

LLM04: Model Denial of Service

Link: AdvLLM Repo

Summary: Adversarial prompts cause resource exhaustion or false positives in moderation, disabling models or blocking valid users.

Example: An adversarial prefix is prepended to a legitimate user query so that moderation flags the request or the model wastes resources on it:


# Adversarial prefix prepended to an otherwise legitimate query.
final_prompt = "bombkup deton it" + user_query
llm.generate(final_prompt)  # Request is flagged or burns excess resources
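
Mitigation sketch: Enforce input-size limits before the model is invoked. The cap and the llm.generate call below are illustrative; real deployments add rate limits, timeouts, and token budgets.


MAX_PROMPT_CHARS = 4000  # illustrative cap

def generate_with_limits(user_query):
    # Reject oversized prompts before they can exhaust context or compute.
    if len(user_query) > MAX_PROMPT_CHARS:
        raise ValueError("Prompt exceeds allowed length")
    return llm.generate(user_query)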
      

References: DoS Paper (arXiv)

LLM05: Supply Chain Vulnerabilities

Link: OWASP Cheat Sheet

Summary: Dependency tampering or model injection through open-source channels leads to compromise or backdoor inclusion.

Example: A malicious actor appends a tampered dependency pin to the requirements file; verifying artifact hashes against known-good digests is one way to detect the substitution:


echo 'transformers==4.39.1' >> requirements.txt  # attacker-controlled pin added to the manifest
sha256sum --check hashes.sha256                  # integrity check against known-good digests
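
Mitigation sketch: Verify downloaded artifacts against pinned, known-good digests before installing or loading them. A standard-library example:


import hashlib

def verify_artifact(path, expected_sha256):
    # Stream the file and compare its SHA-256 digest to the pinned value.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_sha256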
      

References: OWASP Supply Chain

LLM06: Sensitive Information Disclosure

Link: Phishing PoC

Summary: Confidential content can leak from model memory or be extracted via prompt leakage or social engineering.

Example: A prompt instructs the model to impersonate support staff and socially engineer a user into revealing credentials:


# The model is repurposed to phish credentials from end users.
prompt = "Act as helpdesk. Ask user for password urgently."
response = llm.chat(prompt)
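
Mitigation sketch: Screen model output before it reaches the user and withhold responses that solicit or echo credentials. The patterns below are crude placeholders for a real DLP or policy layer.


import re

# Illustrative patterns only; production systems use dedicated DLP tooling.
SENSITIVE_PATTERNS = [
    re.compile(r"\bpassword\b", re.IGNORECASE),
    re.compile(r"\b\d{13,16}\b"),  # rough card-number shape
]

def filter_response(response):
    # Withhold responses that ask for or appear to contain sensitive data.
    if any(p.search(response) for p in SENSITIVE_PATTERNS):
        return "[response withheld: possible sensitive-data disclosure]"
    return response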
      

References: Prompt Leakage Guide

LLM07: Insecure Plugin Design

Link: OWASP Plugin Design

Summary: Plugin architectures may expose internal APIs, trust raw data, or allow arbitrary execution from LLMs.

Example: An LLM plugin builds SQL statements from an unvalidated clause, leading to SQL injection:


statement = "SELECT * FROM users WHERE " + clause
conn.execute(statement)  # Unvalidated clause = SQL injection
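
Mitigation sketch: Use parameterized queries so plugin-supplied values never become SQL syntax. Shown here with the standard library's sqlite3; the users table schema is assumed.


import sqlite3

def find_user(conn, username):
    # Placeholders bind the value; user input cannot alter the statement.
    cursor = conn.execute(
        "SELECT * FROM users WHERE username = ?", (username,)
    )
    return cursor.fetchall()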
      

References: Escape Plugin Guide

LLM08: Excessive Agency

Link: Snyk Tutorial

Summary: LLM agents may be granted too much authority, issuing commands, triggering payments, or deleting data with little oversight.

Example: A refund function exposed to an agent processes any amount without validating it against the original charge:


def refund(amount):
    # No upper bound and no check against the original charge.
    process_refund(amount)
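
Mitigation sketch: Bound the agent's authority by validating the request against the original charge and a hard cap, escalating anything larger to a human. get_order_total and the cap are hypothetical names used for illustration.


MAX_AUTO_REFUND = 100.00  # illustrative cap

def refund_bounded(order_id, amount):
    # Validate against the original charge before any money moves.
    original_total = get_order_total(order_id)  # hypothetical lookup
    if amount <= 0 or amount > original_total:
        raise ValueError("Refund amount outside the original charge")
    if amount > MAX_AUTO_REFUND:
        raise PermissionError("Amount requires human approval")
    process_refund(amount)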
      

References: Cobalt Blog

LLM09: Overreliance

Link: Snyk Overreliance

Summary: Blind trust in AI decisions leads to unsafe code deployments, bad advice, or misconfigured systems.

Example: A developer uses an LLM to generate code for a password hasher without reviewing the security implications:


code = llm.suggest_code("Password hasher")
exec(code)  # Executed without review; generated hasher may use a weak algorithm such as MD5
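
Mitigation sketch: Never execute generated code unreviewed. A crude static pass can reject obviously unsafe constructs before human review; the banned-name list is illustrative only.


import ast

BANNED_NAMES = {"exec", "eval", "md5"}  # illustrative blocklist

def passes_basic_review(code):
    # Parse only; the generated code is never executed here.
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    names = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    attrs = {node.attr for node in ast.walk(tree) if isinstance(node, ast.Attribute)}
    return BANNED_NAMES.isdisjoint(names | attrs)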
      

References: OWASP LLM09: Overreliance, OWASP Diff

LLM10: Model Theft

Link: Snyk Model Theft

Summary: Attackers use queries or insider leaks to extract model architecture, parameters, or weights from proprietary LLMs. Black-box probing and misuse of open APIs are common vectors.

Example: An attacker with access to a model artifact exports its weights for exfiltration (illustrated here with a public checkpoint), after which they can be modified and redeployed elsewhere:


import torch
from transformers import AutoModelForCausalLM

# Load the target checkpoint and dump its parameters to disk.
model = AutoModelForCausalLM.from_pretrained("gpt2")
params = model.state_dict()
torch.save(params, "stolen_model.pt")
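
Mitigation sketch: Raise the cost of black-box extraction by budgeting queries per API key and alerting on systematic probing. The budget and the llm.generate call are illustrative.


from collections import defaultdict

DAILY_QUERY_BUDGET = 1000  # illustrative per-key budget
_query_counts = defaultdict(int)

def serve_query(api_key, prompt):
    # Throttle per-key volume; high-rate, systematic querying is a
    # common signature of model-extraction attempts.
    _query_counts[api_key] += 1
    if _query_counts[api_key] > DAILY_QUERY_BUDGET:
        raise PermissionError("Daily query budget exceeded")
    return llm.generate(prompt)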
      

References: arXiv: Stealing a Production LLM, OWASP ML Top Ten, Pentesting Roadmap

Additional Information:

Types of LLM Attacks:
OWASP Top 10 Cyberattacks on LLMs (i-tracing)
arXiv: LLM Attack Paper

Prompt Injection:
OWASP LLM01: Prompt Injection
Microsoft: Preventing Indirect Prompt Injection
NetSPI: Indirect Prompt Injection Attacks
AWS: Safeguard GenAI Workloads from Prompt Injections