Prompt Injection
Prompt injection is a type of attack targeting AI systems where an attacker crafts deceptive text input designed to manipulate a large language model into behaving in unintended ways. The attacker's instructions may conflict with or override the model's original system instructions. This can cause the AI to ignore its guidelines, leak information, or take unauthorized actions.
Prompt injection is a cybersecurity attack vector specific to large language models (LLMs) and conversational AI systems in which an attacker deliberately embeds malicious or conflicting instructions within input prompts. These crafted inputs exploit the model's inability to reliably distinguish between trusted system-level instructions and untrusted user-supplied content, causing the model to deviate from its intended behavior. Attacks may be direct, where a user manipulates the model through their own input, or indirect, where malicious instructions are introduced through external content the model processes (such as retrieved documents or tool outputs). Prompt injection is typically categorized as a form of social engineering adapted to AI systems, and its exploitability is generally context-dependent at runtime rather than detectable through static analysis of model weights or code alone.
Why it matters
Prompt injection represents a fundamentally new class of vulnerability introduced by integrating large language models into applications. Unlike traditional injection attacks targeting databases or operating systems, prompt injection exploits the model's inability to reliably separate trusted instructions from untrusted user input. As LLMs are increasingly deployed in agentic roles with access to tools, APIs, and sensitive data, a successful prompt injection can cause the model to leak confidential information, take unauthorized actions on behalf of a user, or be weaponized against the organization operating it.
Who it's relevant to
Inside Prompt Injection
Common questions
Answers to the questions practitioners most commonly ask about Prompt Injection.