AI chatbots today aren’t just for casual conversations. Many assistants can summarize documents, read webpages, search company knowledge bases, and even do tasks like creating tickets or drafting emails. This makes work faster and easier, but it also introduces new security risks. Prompt injection is when someone adds text that tricks an AI chatbot into treating untrusted content as instructions. Prompt injection is one of the biggest risks in AI chatbots and is listed as LLM01 in the OWASP Top 10 for LLM Applications. In normal chatbots, you expect the bot to follow the rules it was designed to follow and do exactly what you asked. But an AI chatbot can sometimes get “fooled” by text it reads, like words taken from a website or a document. If the chatbot treats that text like a command, it may do the wrong thing: In short: Prompt injection = “untrusted text pretending to be instructions.” To understand prompt injection, it helps to understand what the AI chatbot actually receives. A typical LLM chatbot request often looks like a bundle: Here’s the problem: To an LLM, everything it receives can look the same, just text. If untrusted content includes lines that sound like commands, the chatbot may follow them like real instructions, especially if the chatbot doesn’t clearly tell the AI what is a “rule” and what is just “content,” or if it over-trusts whatever the AI says without checking. This risk is even bigger in systems that do: In those setups, an attacker might not need to “hack” the chatbot directly. They may only need to place malicious instructions in content the chatbot is likely to read, like a public page, shared doc, or injected knowledge-base entry. Prompt injection usually shows up in two forms: This section has a safe customer support practice lab that shows what prompt injection looks like. Follow the same setup used below, then try it yourself. I. Vulnerable Mode – Direct Prompt Injection Step 1: Set the mode to vulnerable Step 2: Click the attack button to have the direct prompt injection appear in the chatbox. Then click the send button. Result: II. Vulnerable Mode – Indirect Prompt Injection Step 1: Set the mode to vulnerable. Step 2: Click the copy button for the LAB_OVERRIDE instruction. Step 3: In the Lab Inputs panel, paste the LAB_OVERRIDE anywhere in the Untrusted Content Box. Step 4: Type any message in the chatbox. For this instance, I asked for the status of my order. Then, click the send button. Result: In both scenarios, notice that the chatbot’s reply gets changed by the hidden instruction, showing how untrusted text can take over what it does. III. Defended Mode – Direct Prompt Injection Step 1: Repeat the steps from the Vulnerable Mode – Direct Prompt Injection. But switch to defended mode. Step 2: Click the send button. Result: IV. Defended Mode – Indirect Prompt Injection Step 1: Repeat the steps from the Vulnerable Mode – Indirect Prompt Injection. But switch to defended mode. Step 2: Click the send button. Result: The assistant ignores the injected marker and performs the normal task. This lab is not meant to teach people to bypass real systems. It’s meant to teach a concept: If a chatbot mixes untrusted text with instructions, the AI can be tricked. If a chatbot keeps data and safety rules separate, the trick won’t work. Separate instructions from data. Clearly isolate retrieved/untrusted content and treat it as reference material only. Use “least privilege” for tools. Only allow necessary tools; add human confirmation for sensitive actions. Validate output. Enforce format rules for structured outputs; reject or re-ask if invalid. Constrain behavior with allowlists. Allow only approved actions, domains, or operations especially in agent systems. Harden retrieval pipelines. Filter/flag suspicious instruction-like patterns in retrieved content and keep traceability of sources. Log and monitor. Track retrieved sources, model responses, and tool-call attempts to spot abnormal behavior. OWASP Top 10 for LLM Applications – Official OWASP GenAI project page listing the LLM Top 10 risks (including LLM01 Prompt Injection). LLM01: Prompt Injection (OWASP) – OWASP overview of prompt injection, how it occurs, and why it matters. LLM Prompt Injection Prevention Cheat Sheet (OWASP) – Practical mitigation patterns and defensive guidance for LLM apps. NIST AI 600-1: Generative AI Profile – Discussion of GenAI risks, including direct and indirect prompt injection concepts and risk considerations.
What is Prompt Injection?
How Prompt Injection Works (What the AI “Sees”)
Example: “You are a helpful assistant. Summarize documents. Follow policy.”
Example: “Summarize this article in 2 sentences.”
Example: text from a webpage, an email, a PDF, or a knowledge-base entry.
Direct Prompt Injection vs Indirect Prompt Injection Examples
Try Prompt Injection: Vulnerable Mode and Defended Mode
Try It Yourself:
Important Note!
Prompt Injection Mitigations Checklist for AI Chatbots
References
Prompt Injection Attack in AI Chatbots (OWASP LLM Top 10): What It Is, How It Works, and a Simple Lab
AWS, Azure, and GCP Certifications are consistently among the top-paying IT certifications in the world, considering that most companies have now shifted to the cloud. Earn over $150,000 per year with an AWS, Azure, or GCP certification!
Follow us on LinkedIn, YouTube, Facebook, or join our Slack study group. More importantly, answer as many practice exams as you can to help increase your chances of passing your certification exams on your first try!
View Our AWS, Azure, and GCP Exam Reviewers Check out our FREE coursesOur Community
~98%
passing rate
Around 95-98% of our students pass the AWS Certification exams after training with our courses.
200k+
students
Over 200k enrollees choose Tutorials Dojo in preparing for their AWS Certification exams.
~4.8
ratings
Our courses are highly rated by our enrollees from all over the world.





















