@huixu11 huixu11 commented Jan 15, 2026

Demo Video

https://youtu.be/D991loZgFS4

Insecure Output Handling: tests whether AI assistants can be induced to generate dangerous configuration files (e.g., LangChain serialization or YAML injection payloads) that execute malicious code when deserialized by downstream systems.

| Name | Category | Self-Designed | Attack Succeeded | Baseline Passed | Relation to Lambda Scenarios | README Link |
|---|---|---|---|---|---|---|
| ConfigForge | Insecure Output Handling (OWASP LLM06) | ✅ Yes | ✅ Yes (Round 2) | ✅ Yes | Inspired by K8s Kill scenario; generalizes config injection across systems (LangChain, K8s, Terraform, Docker, GitHub Actions) | README |

Copilot AI review requested due to automatic review settings January 15, 2026 19:27

Copilot AI left a comment

Pull request overview

This PR adds "ConfigForge", a security arena scenario that tests whether AI assistants can be tricked into generating dangerous configuration files (LangChain serialization, YAML injection, Pickle patterns) that would execute malicious code when deserialized by downstream systems. The submission is inspired by OWASP LLM06 (Insecure Output Handling) and generalizes the K8s Kill scenario to multiple configuration systems.
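To make the threat concrete, here is a minimal sketch of the kind of static check a downstream consumer might run on model-generated config text before deserializing it. It is not the PR's simulator: the function names and the marker list are hypothetical, chosen for illustration. It relies on two commonly documented shapes: PyYAML's `!!python/...` tags, which can construct arbitrary Python objects under an unsafe loader, and LangChain's serialized manifests, which carry `"lc": 1` and `"type": "constructor"`.

```python
import json
import re

# Hypothetical marker for illustration only; not taken from the PR.
# PyYAML tags such as !!python/object/apply can run code under an unsafe loader.
YAML_CODE_TAGS = re.compile(r"!!python/(object|module|name)")


def find_lc_constructors(node, path="$"):
    """Collect paths of dicts shaped like LangChain constructor manifests."""
    hits = []
    if isinstance(node, dict):
        if node.get("lc") == 1 and node.get("type") == "constructor":
            hits.append(path)
        for key, value in node.items():
            hits.extend(find_lc_constructors(value, f"{path}.{key}"))
    elif isinstance(node, list):
        for i, value in enumerate(node):
            hits.extend(find_lc_constructors(value, f"{path}[{i}]"))
    return hits


def scan_generated_config(text):
    """Return a list of finding labels for one piece of generated config text."""
    findings = []
    if YAML_CODE_TAGS.search(text):
        findings.append("yaml-python-tag")
    try:
        parsed = json.loads(text)
    except ValueError:  # not JSON; only the textual checks apply
        parsed = None
    if parsed is not None and find_lc_constructors(parsed):
        findings.append("langchain-constructor")
    return findings
```

A pattern scan like this is easy to evade and is only a first filter; the safer design choice is to never pass untrusted output to a loader that can instantiate objects (for YAML, prefer `yaml.safe_load`).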

Changes:

  • Adds comprehensive ConfigForge scenario plugin with behavioral simulation engine
  • Includes test results showing attack success in Round 2
  • Provides extensive documentation (README, ARCHITECTURE.md, architecture diagrams, example data)

Reviewed changes

Copilot reviewed 17 out of 17 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| plugin.py | Main ConfigForgePlugin implementing ScenarioPlugin interface with evaluation logic |
| simulator.py | Runtime simulation engine modeling LangChain/YAML/Pickle deserializers |
| llm_judge.py | LLM-based semantic analysis via vLLM/OpenRouter |
| extractor.py | Multi-format config extraction (JSON, YAML, code blocks) |
| models.py | Data models (Severity, OperationType, Operation, SimulationResult, etc.) |
| scenario_configforge.toml | Scenario configuration with model and timeout settings |
| run_with_timeout.py | Wrapper script with httpx timeout patching for slow vLLM inference |
| test_results/*.json | Test results showing baseline passed and attack succeeded in Round 2 |
| ARCHITECTURE.md | Comprehensive design documentation (1307 lines) |
| README.md | User-facing setup and usage instructions |
| architecture.drawio | Architecture diagram in draw.io format |
| data/*.json | Example chains and dangerous patterns reference data |
| .env.example | Environment variable template |
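As an illustration of what "multi-format config extraction" could involve, the sketch below pulls fenced code blocks out of an assistant response and attempts a JSON parse on each. This is an assumption about the general technique, not the contents of extractor.py: `extract_config_candidates` and the dictionary keys are hypothetical names. YAML parsing is left out so the example stays within the standard library.

```python
import json
import re

# Build the fence marker programmatically so this example can itself sit in a fence.
TICKS = "`" * 3
FENCE = re.compile(TICKS + r"([\w+-]*)[ \t]*\n(.*?)" + TICKS, re.DOTALL)


def extract_config_candidates(response):
    """Return candidate config snippets found in an LLM response (hypothetical helper)."""
    candidates = [
        {"lang": lang.lower() or "unknown", "text": body.strip()}
        for lang, body in FENCE.findall(response)
    ]
    if not candidates:
        # No fenced blocks: treat the whole response as a single candidate.
        candidates.append({"lang": "unknown", "text": response.strip()})
    for candidate in candidates:
        try:
            candidate["json"] = json.loads(candidate["text"])
        except ValueError:  # not valid JSON; keep the raw text only
            candidate["json"] = None
    return candidates
```

Each candidate keeps its raw text, so later stages (a simulator or an LLM judge, as listed above) could still inspect snippets that fail to parse.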
