AI Security Commons

OWASP LLM01 Prompt Injection, OWASP LLM06 Excessive Agency

Attack Pattern (Updated)

Indirect Prompt Injection Through Retrieved Content

A concrete attack-pattern note for retrieved text that attempts to override policy, misuse tools, or redirect agent behavior.

OWASP LLM06 Excessive Agency, OWASP LLM02 Sensitive Information Disclosure

Defense (Updated)

Tool Permission Matrix for AI Agents

A reusable Build / Protect template for deciding which agent tools are allowed, approval-required, or blocked.
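
As a rough illustration of the three categories named above, a permission matrix can be sketched as a lookup table. This is a minimal sketch, not the template itself: the tool names are hypothetical, and blocking any unlisted tool by default is an assumption rather than something the template is known to specify.

```python
from enum import Enum


class Decision(Enum):
    ALLOWED = "allowed"
    APPROVAL_REQUIRED = "approval-required"
    BLOCKED = "blocked"


# Hypothetical tool names; a real matrix would list the agent's actual tools.
PERMISSION_MATRIX = {
    "search_docs": Decision.ALLOWED,
    "send_email": Decision.APPROVAL_REQUIRED,
    "delete_records": Decision.BLOCKED,
}


def decide(tool_name: str) -> Decision:
    """Return the decision for a tool. Unlisted tools are blocked (assumed default)."""
    return PERMISSION_MATRIX.get(tool_name, Decision.BLOCKED)


if __name__ == "__main__":
    for name in ("search_docs", "send_email", "unknown_tool"):
        print(name, "->", decide(name).value)
```

Keeping the matrix as data rather than scattered conditionals makes it easier to review which tools fall into each category.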

Browse by research artifact type

🏗️

Architecture

Reference models for securing GenAI systems, tool-using agents, memory, and human approval flows.

🧭

Attack Patterns

Reusable descriptions of common GenAI and agentic AI attack techniques, mapped to public frameworks where useful.

🛡️

Defense Controls

Practical controls, checklists, and decision aids for reducing AI security risk in authorized environments.

🧪

Lab Notes

Controlled observations from synthetic lab scenarios, including setup, expected behavior, and limitations.

🧩

Mission Templates

Reusable lab and mission outlines for educators, researchers, and security teams building safe practice scenarios.

📏

Evaluation Methods

0 entries

Methods for scoring, comparing, and documenting AI security behavior without overclaiming assurance.

Lab Note entries

Lab Note (Updated)

Lab Note 001: Prompt Injection Against a Tool-Using Agent

A lab-style observation from an AI War Games scenario where attacker text tries to move from chat into unauthorized tool use.

Mapped to common AI security frameworks

Framework mappings are educational aids, not compliance claims

Where useful, entries are mapped to public AI security frameworks to make them easier to reuse in education, reviews, authorized testing, and internal security programs. Current mappings reference OWASP GenAI / LLM Top 10, MITRE ATLAS, and the NIST AI RMF / GenAI Profile.

OWASP GenAI / LLM Top 10, MITRE ATLAS, NIST AI RMF / GenAI Profile

How this complements vulnerable-app and CTF labs

AI War Games is not designed as a single vulnerable AI application. It complements vulnerable-app and CTF-style projects by emphasizing reusable learning paths, defense review workflows, before/after comparison, study-group evidence, mission templates, and anonymized research observations.

Open Artifacts Roadmap

Reusable artifacts for research, education, and defensive practice

This roadmap outlines the public-good artifacts AI War Games is developing for safe labs and reusable research workflows. These are early artifacts and initial drafts; labels indicate what is usable now versus still planned.

Browse draft files in the docs/commons folder.

Mission Schema

In progress

Initial draft schema for defining safe LLM and agentic AI security labs, including objective, target, OWASP mapping, hints, success criteria, and learning notes.
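
To make the listed fields concrete, a mission definition might be sketched as a small data class. The field names and types below are assumptions derived from the description above, not the draft schema's actual layout, and the `problems` check is a hypothetical helper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Mission:
    # Field names are illustrative assumptions, not the draft schema's own.
    objective: str
    target: str
    owasp_mapping: List[str]
    success_criteria: List[str]
    hints: List[str] = field(default_factory=list)
    learning_notes: str = ""

    def problems(self) -> List[str]:
        """List the required fields that are empty (hypothetical check)."""
        issues = []
        if not self.objective.strip():
            issues.append("objective is empty")
        if not self.target.strip():
            issues.append("target is empty")
        if not self.owasp_mapping:
            issues.append("owasp_mapping is empty")
        if not self.success_criteria:
            issues.append("success_criteria is empty")
        return issues


example = Mission(
    objective="Observe whether retrieved text can trigger an unauthorized tool call",
    target="Synthetic tool-using agent in a sandbox",
    owasp_mapping=["LLM01 Prompt Injection", "LLM06 Excessive Agency"],
    success_criteria=["Agent declines the injected tool request"],
)

if __name__ == "__main__":
    print(example.problems())
```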

Evidence Record Schema

v0.1 draft privacy-safe format for recording lab outcomes, attempt categories, analyzer verdicts, and learning observations without exposing raw harmful prompts.
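
One way to keep raw prompts out of stored records is to copy only an explicit allowlist of fields. The sketch below assumes hypothetical field names (`lab_id`, `outcome`, `attempt_category`, `analyzer_verdict`, `learning_observation`, `raw_prompt`); the actual v0.1 draft may name or structure these differently.

```python
# Fields allowed into a stored record. Names are assumptions for illustration.
ALLOWED_FIELDS = (
    "lab_id",
    "outcome",
    "attempt_category",
    "analyzer_verdict",
    "learning_observation",
)


def to_evidence_record(attempt: dict) -> dict:
    """Copy only allowlisted fields, so raw prompt text is never stored."""
    return {key: attempt[key] for key in ALLOWED_FIELDS if key in attempt}


if __name__ == "__main__":
    attempt = {
        "lab_id": "lab-001",
        "outcome": "blocked",
        "attempt_category": "indirect prompt injection",
        "analyzer_verdict": "no unauthorized tool call",
        "learning_observation": "Approval gate stopped the tool request.",
        "raw_prompt": "<omitted from stored records>",
    }
    print(to_evidence_record(attempt))
```

An allowlist fails closed: a field added to the attempt data later stays out of the record until someone deliberately permits it.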

Defense Comparison Schema

v0.1 draft structure for documenting before/after defense results, vulnerabilities reduced, residual findings, and recommended controls.
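
A before/after comparison of this kind reduces to set arithmetic over finding labels. The following is a minimal sketch under stated assumptions: the finding labels are hypothetical, `new_findings` is an extra field not mentioned in the description above, and recommended controls are left out because they need human judgment.

```python
def compare_defenses(before: set, after: set) -> dict:
    """Compare finding labels observed before and after a defense change."""
    return {
        # Present before the change, absent after it.
        "vulnerabilities_reduced": sorted(before - after),
        # Present both before and after.
        "residual_findings": sorted(before & after),
        # Assumed extra field: findings that appear only after the change.
        "new_findings": sorted(after - before),
    }


if __name__ == "__main__":
    before = {"prompt_injection", "tool_misuse", "data_leakage"}
    after = {"data_leakage"}
    print(compare_defenses(before, after))
```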

Evaluation Rubric

In progress

Initial draft criteria for assessing prompt injection, data leakage, tool misuse, policy bypass, and mitigation effectiveness.

Responsible Use Guide

Available

Draft artifact with safety boundaries for authorized, sandboxed, defensive AI security practice.

Study Group Facilitator Guide

v0.1 draft reusable guidance for running AI security learning sessions with labs, discussion prompts, and aggregate learning summaries.

Sample Checkup Report Format