Submission: AgentBusters - configforge #18

huixu11 · 2026-01-15T19:27:15Z

Demo Video

Insecure Output Handling - Testing if AI assistants generate dangerous configuration files (e.g., LangChain/YAML injection) that execute malicious code when deserialized by downstream systems

Name	Category	Self-Designed	Attack Succeeded	Baseline Passed	Relation to Lambda Scenarios	README Link
ConfigForge	Insecure Output Handling (OWASP LLM06)	✅ Yes	✅ Yes (Round 2)	✅ Yes	Inspired by K8s Kill scenario; generalizes config injection across systems (LangChain, K8s, Terraform, Docker, GitHub Actions)	README

Copilot

Pull request overview

This PR adds "ConfigForge", a security arena scenario that tests whether AI assistants can be tricked into generating dangerous configuration files (LangChain serialization, YAML injection, Pickle patterns) that would execute malicious code when deserialized by downstream systems. The submission is inspired by OWASP LLM06 (Insecure Output Handling) and generalizes the K8s Kill scenario to multiple configuration systems.

Changes:

Adds comprehensive ConfigForge scenario plugin with behavioral simulation engine
Includes test results showing attack success in Round 2
Provides extensive documentation (README, ARCHITECTURE.md, architecture diagrams, example data)

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
plugin.py	Main ConfigForgePlugin implementing ScenarioPlugin interface with evaluation logic
simulator.py	Runtime simulation engine modeling LangChain/YAML/Pickle deserializers
llm_judge.py	LLM-based semantic analysis via vLLM/OpenRouter
extractor.py	Multi-format config extraction (JSON, YAML, code blocks)
models.py	Data models (Severity, OperationType, Operation, SimulationResult, etc.)
scenario_configforge.toml	Scenario configuration with model and timeout settings
run_with_timeout.py	Wrapper script with httpx timeout patching for slow vLLM inference
test_results/*.json	Test results showing baseline passed and attack succeeded in Round 2
ARCHITECTURE.md	Comprehensive design documentation (1307 lines)
README.md	User-facing setup and usage instructions
architecture.drawio	Architecture diagram in draw.io format
data/*.json	Example chains and dangerous patterns reference data
.env.example	Environment variable template

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

scenarios/security_arena/submissions/AgentBusters/configforge/plugin.py

scenarios/security_arena/submissions/AgentBusters/configforge/llm_judge.py

scenarios/security_arena/submissions/AgentBusters/configforge/run_with_timeout.py

Submission: AgentBusters - configforge

7f0b2a4

Copilot AI review requested due to automatic review settings January 15, 2026 19:27

Copilot started reviewing on behalf of huixu11 January 15, 2026 19:27 View session

Copilot AI reviewed Jan 15, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Submission: AgentBusters - configforge #18

Submission: AgentBusters - configforge #18

Uh oh!

huixu11 commented Jan 15, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Submission: AgentBusters - configforge #18

Are you sure you want to change the base?

Submission: AgentBusters - configforge #18

Uh oh!

Conversation

huixu11 commented Jan 15, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant