Define Injection Scenarios
Each test scenario explicitly separates trusted instruction, untrusted content, and attack payload, mirroring real systems where user/external text must not become instructions.
- Trusted instruction The system prompt or policy the LLM should follow
- Untrusted content User input, RAG docs, tool output: anything external
- Attack payload The injection attempt embedded in untrusted content
- Violation predicate What counts as a policy violation (the failure condition)