Open research · Educational reuse · Responsible AI security learning

AI Security Commons

Public research notes, lab scenarios, attack-pattern mappings, defensive architecture guidance, and reusable mission templates for GenAI and agentic AI security.

AI War Games publishes practical security material to help developers, educators, researchers, and security teams understand how AI systems fail and how to defend them. The material is designed to be reusable in education, study groups, product security reviews, authorized testing, and hands-on labs.

Learn → Attack → Defend → Build / Protect → Research

Research is the documentation layer of the lab loop

Each Commons artifact is meant to explain behavior observed in learning tracks, missions, Agentic Lab, Builder validation, or Defender replay, rather than to stand apart as a generic blog post.

1. Learn

Build the mental model for LLM failures, prompt injection, and agentic risk.

2. Attack

Practice controlled prompt injection, jailbreaks, data extraction, and tool misuse in safe browser labs.

3. Defend

Replay failures, inspect what happened, and connect mitigations to specific controls.

4. Build / Protect

Create scenarios, define secrets and tool boundaries, validate attacks, and harden the design.

5. Research

Turn observations into reusable notes, attack patterns, templates, and evaluation methods.

Featured research

Start with these reusable notes

Architecture · Updated

Agentic AI Security Reference Architecture v0.2

A product-grounded architecture for Agentic Lab systems that use tools, memory, retrieval, approval gates, and replay logs.

OWASP LLM01 Prompt Injection · OWASP LLM06 Excessive Agency

Attack Pattern · Updated

Indirect Prompt Injection Through Retrieved Content

A concrete attack-pattern note for retrieved text that attempts to override policy, misuse tools, or redirect agent behavior.

OWASP LLM01 Prompt Injection · OWASP LLM06 Excessive Agency

Defense · Updated

Tool Permission Matrix for AI Agents

A reusable Build / Protect template for deciding which agent tools are allowed, approval-required, or blocked.

OWASP LLM06 Excessive Agency · OWASP LLM02 Sensitive Information Disclosure
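
As a rough illustration of the three decision levels named in this template (allowed, approval-required, blocked), a permission matrix can be sketched as a lookup table with a deny-by-default fallback. The tool names and the `check_tool` helper below are hypothetical and are not taken from the published note.

```python
from enum import Enum


class Decision(Enum):
    ALLOWED = "allowed"
    APPROVAL_REQUIRED = "approval-required"
    BLOCKED = "blocked"


# Hypothetical tool names, chosen only for illustration.
PERMISSION_MATRIX = {
    "search_docs": Decision.ALLOWED,
    "send_email": Decision.APPROVAL_REQUIRED,
    "delete_records": Decision.BLOCKED,
}


def check_tool(tool_name: str) -> Decision:
    """Return the decision for a tool; tools missing from the matrix are blocked."""
    return PERMISSION_MATRIX.get(tool_name, Decision.BLOCKED)
```

Treating unlisted tools as blocked is an assumption of this sketch: it keeps a newly added tool from becoming callable before someone has reviewed it.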

Browse by research artifact type

🏗️

Architecture

1 entry

Reference models for securing GenAI systems, tool-using agents, memory, and human approval flows.

🧭

Attack Patterns

1 entry

Reusable descriptions of common GenAI and agentic AI attack techniques, mapped to public frameworks where useful.

🛡️

Defense Controls

1 entry

Practical controls, checklists, and decision aids for reducing AI security risk in authorized environments.

🧪

Lab Notes

1 entry

Controlled observations from synthetic lab scenarios, including setup, expected behavior, and limitations.

🧩

Mission Templates

1 entry

Reusable lab and mission outlines for educators, researchers, and security teams building safe practice scenarios.

📏

Evaluation Methods

0 entries

Methods for scoring, comparing, and documenting AI security behavior without overclaiming assurance.

Mapped to common AI security frameworks

Framework mappings are educational aids, not compliance claims

Where useful, entries are mapped to public AI security frameworks to make them easier to reuse in education, reviews, authorized testing, and internal security programs. Current mappings reference OWASP GenAI / LLM Top 10, MITRE ATLAS, and the NIST AI RMF / GenAI Profile.

OWASP GenAI / LLM Top 10 · MITRE ATLAS · NIST AI RMF / GenAI Profile

How this complements vulnerable-app and CTF labs

AI War Games is not designed as a single vulnerable AI application. It complements vulnerable-app and CTF-style projects by emphasizing reusable learning paths, defense review workflows, before/after comparison, study-group evidence, mission templates, and anonymized research observations.

Open Artifacts Roadmap

Reusable artifacts for research, education, and defensive practice

This roadmap outlines the public-good artifacts AI War Games is developing for safe labs and reusable research workflows. These are early artifacts and initial drafts; labels indicate what is usable now versus still planned.

Browse draft files in the docs/commons folder.

Mission Schema

In progress

Initial draft schema for defining safe LLM and agentic AI security labs, including objective, target, OWASP mapping, hints, success criteria, and learning notes.
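
To make the listed fields concrete, here is a minimal sketch of how a mission definition might be modeled. It is not the draft schema itself: the field names follow the description above, while the `Mission` class, the choice of which fields are required, and the `problems` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Mission:
    objective: str
    target: str
    owasp_mapping: list[str]
    success_criteria: list[str]
    hints: list[str] = field(default_factory=list)
    learning_notes: str = ""

    def problems(self) -> list[str]:
        """Name the required fields that are empty; an empty list means none are missing."""
        # Assumption: hints and learning notes are optional, the rest are required.
        required = {
            "objective": self.objective,
            "target": self.target,
            "owasp_mapping": self.owasp_mapping,
            "success_criteria": self.success_criteria,
        }
        return [name for name, value in required.items() if not value]
```

A lab author could call `problems()` before publishing a mission to catch, for example, a scenario with no success criteria.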

Evidence Record Schema

Initial draft

v0.1 draft privacy-safe format for recording lab outcomes, attempt categories, analyzer verdicts, and learning observations without exposing raw harmful prompts.
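
One way to record an attempt without keeping the raw prompt is sketched below. The field names mirror the description above. Storing a SHA-256 digest and a length in place of the prompt text is an assumption of this sketch, not a documented feature of the v0.1 draft, and `make_evidence_record` is a hypothetical helper.

```python
import hashlib


def make_evidence_record(raw_prompt: str, outcome: str, attempt_category: str,
                         analyzer_verdict: str, learning_observation: str) -> dict:
    """Build a record that describes an attempt but never stores the prompt text."""
    # Assumption: a digest lets identical attempts be grouped without keeping the text.
    digest = hashlib.sha256(raw_prompt.encode("utf-8")).hexdigest()
    return {
        "outcome": outcome,
        "attempt_category": attempt_category,
        "analyzer_verdict": analyzer_verdict,
        "learning_observation": learning_observation,
        "prompt_sha256": digest,
        "prompt_length": len(raw_prompt),
    }
```

A digest of a short or well-known prompt can still be guessed by hashing candidate strings, so a stricter variant of this sketch would omit the digest and keep only the category and verdict.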

Defense Comparison Schema

Initial draft

v0.1 draft structure for documenting before/after defense results, vulnerabilities reduced, residual findings, and recommended controls.
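
The before/after comparison can be illustrated with plain set arithmetic over finding labels. The `compare_defense` function, its output keys, and the labels in the usage note are hypothetical. The `new_findings` key is an addition of this sketch, and recommended controls are left out because they need human judgment.

```python
def compare_defense(before: set[str], after: set[str]) -> dict[str, list[str]]:
    """Compare the findings observed before and after a defense was applied."""
    return {
        # Present before the defense and absent after it.
        "vulnerabilities_reduced": sorted(before - after),
        # Still present after the defense.
        "residual_findings": sorted(before & after),
        # Absent before and present after, which may signal a regression.
        "new_findings": sorted(after - before),
    }
```

For example, comparing `{"prompt_injection", "tool_misuse"}` with `{"tool_misuse"}` reports `prompt_injection` as reduced and `tool_misuse` as residual.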

Evaluation Rubric

In progress

Initial draft criteria for assessing prompt injection, data leakage, tool misuse, policy bypass, and mitigation effectiveness.

Responsible Use Guide

Available

Draft artifact with safety boundaries for authorized, sandboxed, defensive AI security practice.

Study Group Facilitator Guide

Initial draft

v0.1 draft reusable guidance for running AI security learning sessions with labs, discussion prompts, and aggregate learning summaries.

Sample Checkup Report Format

Initial draft

v0.1 draft report structure for findings, evidence, OWASP mappings, remediation notes, and limitations.

Links are published only for available or initial draft artifacts that currently exist in the repository.

Open-use note

These materials are intended for education, defensive research, and responsible AI security learning. Content is published for reuse under open licenses where specified. Attack examples are simplified and controlled. Do not use these techniques against systems without authorization. Read the Responsible Use Guide and Research Use Terms for project boundaries.