Glossary
Prompt Injection
Also known as: LLM prompt injection, indirect prompt injection
Definition
Prompt injection is a class of attack where adversarial input is crafted to redirect a large language model's behavior away from its intended instructions. Direct injection comes from the user's prompt; indirect injection comes from third-party content the model retrieves or processes. It is OWASP's #1 LLM security risk (LLM01).
Why it matters
Prompt injection is the AI-era equivalent of SQL injection: a fundamental class of attack that exists because LLMs cannot reliably distinguish trusted instructions from untrusted data. Successful injection can leak system prompts, exfiltrate sensitive context, bypass content policies, trigger unintended tool calls, or steer agents into actions the operator never authorized.
The risk is acute for agents that read external content (web pages, emails, documents, retrieved knowledge), because attacker-controlled text in any of those channels becomes part of the model's input. NIST AI RMF MANAGE-2.1, EU AI Act Article 15 cybersecurity requirements, and OWASP's LLM Top 10 all treat prompt injection as a first-class threat that AI systems in production must defend against.
In practice
Prism Guardrails detect known prompt-injection patterns at ingestion and block or flag the request before it reaches the model. Prism Red Teaming runs a curated catalog of jailbreak and injection variants pre-deployment, with severity-tagged findings and reproducer prompts so engineering can fix and re-test.
Related
More glossary terms
Start tracing in 5 minutes
One SDK. Five minutes. Full audit trails, PII redaction, and guardrail enforcement, from day one.