From 2248d2d7275374432553246f2c549f0e49238618 Mon Sep 17 00:00:00 2001 From: Dan Guido Date: Wed, 11 Feb 2026 21:56:13 -0500 Subject: [PATCH 1/5] Add semgrep-scanner and semgrep-triager agents to static-analysis plugin Introduces formal agent definitions for the scanning and triage workflows. Updates SKILL.md to reference agents and bumps version to 1.1.0. Co-Authored-By: Claude Opus 4.6 --- .../.claude-plugin/plugin.json | 2 +- .../static-analysis/agents/semgrep-scanner.md | 71 ++++++++++++ .../static-analysis/agents/semgrep-triager.md | 107 ++++++++++++++++++ .../static-analysis/skills/semgrep/SKILL.md | 15 ++- 4 files changed, 192 insertions(+), 3 deletions(-) create mode 100644 plugins/static-analysis/agents/semgrep-scanner.md create mode 100644 plugins/static-analysis/agents/semgrep-triager.md diff --git a/plugins/static-analysis/.claude-plugin/plugin.json b/plugins/static-analysis/.claude-plugin/plugin.json index 150d55c..1e69cf1 100644 --- a/plugins/static-analysis/.claude-plugin/plugin.json +++ b/plugins/static-analysis/.claude-plugin/plugin.json @@ -1,6 +1,6 @@ { "name": "static-analysis", - "version": "1.0.3", + "version": "1.1.0", "description": "Static analysis toolkit with CodeQL, Semgrep, and SARIF parsing for security vulnerability detection", "author": { "name": "Axel Mierczuk & Paweł Płatek" diff --git a/plugins/static-analysis/agents/semgrep-scanner.md b/plugins/static-analysis/agents/semgrep-scanner.md new file mode 100644 index 0000000..405c195 --- /dev/null +++ b/plugins/static-analysis/agents/semgrep-scanner.md @@ -0,0 +1,71 @@ +--- +name: semgrep-scanner +description: "Executes semgrep CLI scans for a language category. Use when running automated static analysis scans with semgrep against a codebase." +tools: Bash +--- + +# Semgrep Scanner Agent + +You are a Semgrep scanner agent responsible for executing +static analysis scans for a specific language category. + +## Core Rules + +1. 
**Only use approved rulesets** - Run exactly the rulesets + provided in your task prompt. Never add or remove rulesets. +2. **Always use `--metrics=off`** - Prevents sending telemetry + to Semgrep servers. No exceptions. +3. **Use `--pro` when available** - If the task indicates Pro + engine is available, always include the `--pro` flag for + cross-file taint tracking. +4. **Parallel execution** - Run all rulesets simultaneously + using `&` and `wait`. Never run rulesets sequentially. + +## Scan Command Pattern + +For each approved ruleset, generate and run: + +```bash +semgrep [--pro if available] \ + --metrics=off \ + --config [RULESET] \ + --json -o [OUTPUT_DIR]/[lang]-[ruleset-name].json \ + --sarif-output=[OUTPUT_DIR]/[lang]-[ruleset-name].sarif \ + [TARGET] & +``` + +After launching all rulesets: + +```bash +wait +``` + +## GitHub URL Rulesets + +For rulesets specified as GitHub URLs (e.g., +`https://github.com/trailofbits/semgrep-rules`): +- Clone the repository first if not already cached locally +- Use the local path as the `--config` value, or pass the + URL directly to semgrep (it handles GitHub URLs natively) + +## Output Requirements + +After all scans complete, report: +- Number of findings per ruleset +- Any scan errors or warnings +- File paths of all generated JSON and SARIF results +- If Pro was used, note any cross-file findings detected + +## Error Handling + +- If a ruleset fails to download, report the error but + continue with remaining rulesets +- If semgrep exits non-zero for a scan, capture stderr and + include in report +- Never silently skip a failed ruleset + +## Full Reference + +For the complete scanner task prompt template with variable +substitutions and examples, see: +`{baseDir}/references/scanner-task-prompt.md` diff --git a/plugins/static-analysis/agents/semgrep-triager.md b/plugins/static-analysis/agents/semgrep-triager.md new file mode 100644 index 0000000..45431e9 --- /dev/null +++ 
b/plugins/static-analysis/agents/semgrep-triager.md @@ -0,0 +1,107 @@ +--- +name: semgrep-triager +description: "Classifies semgrep scan findings as true or false positives by reading source context. Use when triaging static analysis results to separate real vulnerabilities from noise." +tools: Read, Grep, Glob, Write +--- + +# Semgrep Triage Agent + +You are a security finding triager responsible for classifying +Semgrep scan results as true or false positives by reading +source code context. + +## Task + +For each finding in the provided JSON result files: + +1. Read the JSON finding (rule ID, file, line number) +2. Read source code context (at least 5 lines before/after) +3. Classify as `TRUE_POSITIVE` or `FALSE_POSITIVE` +4. Write a brief reason for the classification + +## Decision Tree + +Apply these checks in order. The first match determines +the classification: + +``` +Finding + |-- In a test file? + | -> FALSE_POSITIVE (note: add to .semgrepignore) + |-- In example/documentation code? + | -> FALSE_POSITIVE + |-- Has nosemgrep comment? + | -> FALSE_POSITIVE (already acknowledged) + |-- Input sanitized/validated upstream? + | Check 10-20 lines before for validation + | -> FALSE_POSITIVE if validated + |-- Code path reachable? 
+ | Check if function is called/exported + | -> FALSE_POSITIVE if dead code + |-- None of the above + -> TRUE_POSITIVE +``` + +## Classification Guidelines + +**TRUE_POSITIVE indicators:** +- User input flows to sensitive sink without sanitization +- Hardcoded credentials or API keys in source (not test) code +- Known-vulnerable function usage in production paths +- Missing security controls (no CSRF, no auth check) + +**FALSE_POSITIVE indicators:** +- Test files with mock/fixture data +- Input is validated before reaching the flagged line +- Code is behind a feature flag or compile-time guard +- Dead code (unreachable function, commented-out caller) +- Documentation or example snippets + +## Output Format + +Write a triage file to `[OUTPUT_DIR]/[lang]-triage.json`: + +```json +{ + "file": "[lang]-[ruleset].json", + "total": 45, + "true_positives": [ + { + "rule": "rule.id.here", + "file": "path/to/file.py", + "line": 42, + "reason": "User input in raw SQL without parameterization" + } + ], + "false_positives": [ + { + "rule": "rule.id.here", + "file": "tests/test_file.py", + "line": 15, + "reason": "Test file with mock data" + } + ] +} +``` + +## Report + +After triage, provide a summary: +- Total findings examined +- True positives count +- False positives count with breakdown by reason category + (test files, sanitized inputs, dead code, etc.) + +## Important + +- Read actual source code for every finding. Never classify + based solely on the rule name or file path. +- When uncertain, classify as TRUE_POSITIVE. False negatives + are worse than false positives in security triage. +- Process all input JSON files for the language category. 
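Reviewer note: the triage steps "read source code context (at least 5 lines before/after)" and "has nosemgrep comment?" can be reproduced by hand when spot-checking the agent's output. The following is a minimal sketch, not part of the patch. `show_context` and `has_nosemgrep` are hypothetical helper names, and the sketch assumes the suppression comment sits on the flagged line itself.

```bash
# Sketch only. show_context and has_nosemgrep are hypothetical helper names
# introduced for illustration; they are not part of the plugin.

# Usage: show_context FILE LINE [RADIUS]
# Prints LINE plus RADIUS lines on each side (default 5), with line numbers.
# The function body is a subshell, so its variables stay local.
show_context() (
  file="$1"; line="$2"; radius="${3:-5}"
  start=$((line - radius))
  [ "$start" -ge 1 ] || start=1   # clamp the window at the top of the file
  end=$((line + radius))
  # Print the window; the flagged line is marked with ">".
  awk -v s="$start" -v e="$end" -v l="$line" \
    'NR >= s && NR <= e { printf "%s%5d  %s\n", (NR == l ? ">" : " "), NR, $0 }' "$file"
)

# Usage: has_nosemgrep FILE LINE
# Succeeds if the flagged line itself carries a nosemgrep comment
# (assumption: a comment on a neighbouring line is not checked here).
has_nosemgrep() (
  file="$1"; line="$2"
  sed -n "${line}p" "$file" | grep -q 'nosemgrep'
)
```

For example, `show_context path/to/file.py 42` prints lines 37 through 47 of a finding reported at line 42. Neither helper classifies anything; the decision tree still requires reading the printed context.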
+ +## Full Reference + +For the complete triage task prompt template with variable +substitutions and examples, see: +`{baseDir}/references/triage-task-prompt.md` diff --git a/plugins/static-analysis/skills/semgrep/SKILL.md b/plugins/static-analysis/skills/semgrep/SKILL.md index ca9e28f..6b6d8c1 100644 --- a/plugins/static-analysis/skills/semgrep/SKILL.md +++ b/plugins/static-analysis/skills/semgrep/SKILL.md @@ -296,7 +296,7 @@ mkdir -p "$OUTPUT_DIR" echo "Output directory: $OUTPUT_DIR" ``` -**Spawn N Tasks in a SINGLE message** (one per language category) using `subagent_type: Bash`. +**Spawn N Tasks in a SINGLE message** (one per language category) using `subagent_type: semgrep-scanner`. Use the scanner task prompt template from [scanner-task-prompt.md]({baseDir}/references/scanner-task-prompt.md). @@ -318,7 +318,7 @@ Spawn these 3 Tasks in a SINGLE message: ### Step 5: Spawn Parallel Triage Tasks -After scan Tasks complete, spawn triage Tasks using `subagent_type: general-purpose` (triage requires reading code context, not just running commands). +After scan Tasks complete, spawn triage Tasks using `subagent_type: semgrep-triager` (triage requires reading code context, not just running commands). Use the triage task prompt template from [triage-task-prompt.md]({baseDir}/references/triage-task-prompt.md). @@ -396,6 +396,17 @@ Results written to: 3. Triage requires reading code context - parallelized via Tasks 4. Some false positive patterns require human judgment +## Agents + +This plugin provides two specialized agents for the scan and triage phases: + +| Agent | Type | Purpose | +|-------|------|---------| +| `semgrep-scanner` | Bash | Executes parallel semgrep scans for a language category | +| `semgrep-triager` | Read, Grep, Glob, Write | Classifies findings as true/false positives by reading source context | + +Use `subagent_type: semgrep-scanner` in Step 4 and `subagent_type: semgrep-triager` in Step 5 when spawning Task subagents. 
+ ## Rationalizations to Reject | Shortcut | Why It's Wrong | From 683a1f2077bbe3bff09b24145745583a1b309001 Mon Sep 17 00:00:00 2001 From: Dan Guido Date: Wed, 11 Feb 2026 22:08:42 -0500 Subject: [PATCH 2/5] Fix {baseDir} paths and bump marketplace.json version Co-Authored-By: Claude Opus 4.6 --- .claude-plugin/marketplace.json | 2 +- plugins/static-analysis/agents/semgrep-scanner.md | 2 +- plugins/static-analysis/agents/semgrep-triager.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/.claude-plugin/marketplace.json b/.claude-plugin/marketplace.json index f495326..11d9396 100644 --- a/.claude-plugin/marketplace.json +++ b/.claude-plugin/marketplace.json @@ -188,7 +188,7 @@ }, { "name": "static-analysis", - "version": "1.0.3", + "version": "1.1.0", "description": "Static analysis toolkit with CodeQL, Semgrep, and SARIF parsing for security vulnerability detection", "author": { "name": "Axel Mierczuk & Paweł Płatek" diff --git a/plugins/static-analysis/agents/semgrep-scanner.md b/plugins/static-analysis/agents/semgrep-scanner.md index 405c195..5a90912 100644 --- a/plugins/static-analysis/agents/semgrep-scanner.md +++ b/plugins/static-analysis/agents/semgrep-scanner.md @@ -68,4 +68,4 @@ After all scans complete, report: For the complete scanner task prompt template with variable substitutions and examples, see: -`{baseDir}/references/scanner-task-prompt.md` +`{baseDir}/skills/semgrep/references/scanner-task-prompt.md` diff --git a/plugins/static-analysis/agents/semgrep-triager.md b/plugins/static-analysis/agents/semgrep-triager.md index 45431e9..0b31480 100644 --- a/plugins/static-analysis/agents/semgrep-triager.md +++ b/plugins/static-analysis/agents/semgrep-triager.md @@ -104,4 +104,4 @@ After triage, provide a summary: For the complete triage task prompt template with variable substitutions and examples, see: -`{baseDir}/references/triage-task-prompt.md` +`{baseDir}/skills/semgrep/references/triage-task-prompt.md` From 
f9702e131aa1cf23e73f32b0d942875ed2e748d3 Mon Sep 17 00:00:00 2001 From: Dan Guido Date: Wed, 11 Feb 2026 22:32:39 -0500 Subject: [PATCH 3/5] fix: resolve code review findings for PR #80 - Update scanner-task-prompt.md to reference semgrep-scanner subagent type instead of Bash (consistent with SKILL.md Step 4 change) - Update triage-task-prompt.md to reference semgrep-triager subagent type instead of general-purpose (consistent with SKILL.md Step 5 change) - Add Agents Included section to README.md documenting new agents - Fix Agents table column header in SKILL.md from "Type" to "Tools" Co-Authored-By: Claude Opus 4.6 --- plugins/static-analysis/README.md | 7 +++++++ plugins/static-analysis/skills/semgrep/SKILL.md | 4 ++-- .../skills/semgrep/references/scanner-task-prompt.md | 2 +- .../skills/semgrep/references/triage-task-prompt.md | 2 +- 4 files changed, 11 insertions(+), 4 deletions(-) diff --git a/plugins/static-analysis/README.md b/plugins/static-analysis/README.md index b9c3013..31eb3fc 100644 --- a/plugins/static-analysis/README.md +++ b/plugins/static-analysis/README.md @@ -47,6 +47,13 @@ Use this plugin when you need to: - Aggregate and deduplicate results from multiple files - CI/CD integration patterns +## Agents Included + +| Agent | Tools | Purpose | +|--------------------|------------------------|----------------------------------------------------------------| +| `semgrep-scanner` | Bash | Executes parallel semgrep scans for a language category | +| `semgrep-triager` | Read, Grep, Glob, Write | Classifies findings as true/false positives by reading source | + ## Installation ``` diff --git a/plugins/static-analysis/skills/semgrep/SKILL.md b/plugins/static-analysis/skills/semgrep/SKILL.md index 6b6d8c1..e99e02c 100644 --- a/plugins/static-analysis/skills/semgrep/SKILL.md +++ b/plugins/static-analysis/skills/semgrep/SKILL.md @@ -400,8 +400,8 @@ Results written to: This plugin provides two specialized agents for the scan and triage phases: -| Agent | 
Type | Purpose | -|-------|------|---------| +| Agent | Tools | Purpose | +|-------|-------|---------| | `semgrep-scanner` | Bash | Executes parallel semgrep scans for a language category | | `semgrep-triager` | Read, Grep, Glob, Write | Classifies findings as true/false positives by reading source context | diff --git a/plugins/static-analysis/skills/semgrep/references/scanner-task-prompt.md b/plugins/static-analysis/skills/semgrep/references/scanner-task-prompt.md index e0c7872..e39a7f3 100644 --- a/plugins/static-analysis/skills/semgrep/references/scanner-task-prompt.md +++ b/plugins/static-analysis/skills/semgrep/references/scanner-task-prompt.md @@ -1,6 +1,6 @@ # Scanner Subagent Task Prompt -Use this prompt template when spawning scanner Tasks in Step 4. Use `subagent_type: Bash`. +Use this prompt template when spawning scanner Tasks in Step 4. Use `subagent_type: semgrep-scanner`. ## Template diff --git a/plugins/static-analysis/skills/semgrep/references/triage-task-prompt.md b/plugins/static-analysis/skills/semgrep/references/triage-task-prompt.md index 3690f86..021b2ff 100644 --- a/plugins/static-analysis/skills/semgrep/references/triage-task-prompt.md +++ b/plugins/static-analysis/skills/semgrep/references/triage-task-prompt.md @@ -1,6 +1,6 @@ # Triage Subagent Task Prompt -Use this prompt template when spawning triage Tasks in Step 5. Use `subagent_type: general-purpose`. +Use this prompt template when spawning triage Tasks in Step 5. Use `subagent_type: semgrep-triager`. ## Template From 524807faa8d7350122cf063b94ecee57f6e7355f Mon Sep 17 00:00:00 2001 From: axelm-tob Date: Thu, 12 Feb 2026 10:19:42 -0500 Subject: [PATCH 4/5] fix: use fully-qualified agent type names for semgrep subagents Plugin agents require the `plugin-name:agent-name` format at runtime, but the skill referenced bare names (`semgrep-scanner`, `semgrep-triager`) causing "Agent type not found" errors when spawning scan/triage Tasks. 
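One way to confirm that no bare agent names remain is to search the plugin tree for `subagent_type:` references that lack the `plugin-name:` prefix. The following is a minimal sketch under that assumption; `find_bare_agent_refs` is a hypothetical helper, not something this patch adds.

```bash
# Sketch only. find_bare_agent_refs is a hypothetical helper name.
# Usage: find_bare_agent_refs DIR AGENT...
# Prints file:line:text for every "subagent_type: AGENT" reference that lacks
# a plugin prefix. Exits 0 when no bare reference is found, non-zero otherwise.
find_bare_agent_refs() (
  dir="$1"; shift
  found=0
  for agent in "$@"; do
    # A bare reference has the agent name directly after "subagent_type: ".
    # A qualified reference ("subagent_type: plugin:agent") does not match
    # this fixed string, because the plugin prefix sits in between.
    if grep -rnF "subagent_type: $agent" "$dir"; then
      found=1
    fi
  done
  [ "$found" -eq 0 ]
)
```

Run against this series, a call such as `find_bare_agent_refs plugins/static-analysis semgrep-scanner semgrep-triager` should print nothing once the fully-qualified names are in place.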
Also adds agent types to the Step 3 plan template and pre-scan checklist so they appear in generated plans and survive context clearing. Co-Authored-By: Claude Opus 4.6 --- plugins/static-analysis/skills/semgrep/SKILL.md | 13 ++++++++----- .../semgrep/references/scanner-task-prompt.md | 2 +- .../skills/semgrep/references/triage-task-prompt.md | 2 +- 3 files changed, 10 insertions(+), 7 deletions(-) diff --git a/plugins/static-analysis/skills/semgrep/SKILL.md b/plugins/static-analysis/skills/semgrep/SKILL.md index e99e02c..f91fd8c 100644 --- a/plugins/static-analysis/skills/semgrep/SKILL.md +++ b/plugins/static-analysis/skills/semgrep/SKILL.md @@ -241,6 +241,8 @@ Present plan to user with **explicit ruleset listing**: - Spawn 3 parallel scan Tasks (Python, JavaScript, Docker) - Total rulesets: 9 - [If Pro] Cross-file taint tracking enabled +- Scan agent: `static-analysis:semgrep-scanner` +- Triage agent: `static-analysis:semgrep-triager` **Want to modify rulesets?** Tell me which to add or remove. **Ready to scan?** Say "proceed" or "yes". @@ -282,6 +284,7 @@ Before marking Step 3 complete, verify: - [ ] User given opportunity to modify rulesets - [ ] User explicitly approved (quote their confirmation) - [ ] **Final ruleset list captured for Step 4** +- [ ] Agent types listed: `static-analysis:semgrep-scanner` and `static-analysis:semgrep-triager` ### Step 4: Spawn Parallel Scan Tasks @@ -296,7 +299,7 @@ mkdir -p "$OUTPUT_DIR" echo "Output directory: $OUTPUT_DIR" ``` -**Spawn N Tasks in a SINGLE message** (one per language category) using `subagent_type: semgrep-scanner`. +**Spawn N Tasks in a SINGLE message** (one per language category) using `subagent_type: static-analysis:semgrep-scanner`. Use the scanner task prompt template from [scanner-task-prompt.md]({baseDir}/references/scanner-task-prompt.md). 
@@ -318,7 +321,7 @@ Spawn these 3 Tasks in a SINGLE message: ### Step 5: Spawn Parallel Triage Tasks -After scan Tasks complete, spawn triage Tasks using `subagent_type: semgrep-triager` (triage requires reading code context, not just running commands). +After scan Tasks complete, spawn triage Tasks using `subagent_type: static-analysis:semgrep-triager` (triage requires reading code context, not just running commands). Use the triage task prompt template from [triage-task-prompt.md]({baseDir}/references/triage-task-prompt.md). @@ -402,10 +405,10 @@ This plugin provides two specialized agents for the scan and triage phases: | Agent | Tools | Purpose | |-------|-------|---------| -| `semgrep-scanner` | Bash | Executes parallel semgrep scans for a language category | -| `semgrep-triager` | Read, Grep, Glob, Write | Classifies findings as true/false positives by reading source context | +| `static-analysis:semgrep-scanner` | Bash | Executes parallel semgrep scans for a language category | +| `static-analysis:semgrep-triager` | Read, Grep, Glob, Write | Classifies findings as true/false positives by reading source context | -Use `subagent_type: semgrep-scanner` in Step 4 and `subagent_type: semgrep-triager` in Step 5 when spawning Task subagents. +Use `subagent_type: static-analysis:semgrep-scanner` in Step 4 and `subagent_type: static-analysis:semgrep-triager` in Step 5 when spawning Task subagents. ## Rationalizations to Reject diff --git a/plugins/static-analysis/skills/semgrep/references/scanner-task-prompt.md b/plugins/static-analysis/skills/semgrep/references/scanner-task-prompt.md index e39a7f3..c8029c9 100644 --- a/plugins/static-analysis/skills/semgrep/references/scanner-task-prompt.md +++ b/plugins/static-analysis/skills/semgrep/references/scanner-task-prompt.md @@ -1,6 +1,6 @@ # Scanner Subagent Task Prompt -Use this prompt template when spawning scanner Tasks in Step 4. Use `subagent_type: semgrep-scanner`. 
+Use this prompt template when spawning scanner Tasks in Step 4. Use `subagent_type: static-analysis:semgrep-scanner`. ## Template diff --git a/plugins/static-analysis/skills/semgrep/references/triage-task-prompt.md b/plugins/static-analysis/skills/semgrep/references/triage-task-prompt.md index 021b2ff..a476063 100644 --- a/plugins/static-analysis/skills/semgrep/references/triage-task-prompt.md +++ b/plugins/static-analysis/skills/semgrep/references/triage-task-prompt.md @@ -1,6 +1,6 @@ # Triage Subagent Task Prompt -Use this prompt template when spawning triage Tasks in Step 5. Use `subagent_type: semgrep-triager`. +Use this prompt template when spawning triage Tasks in Step 5. Use `subagent_type: static-analysis:semgrep-triager`. ## Template From 5f5d01a047da63dcc04cb43e0ef59acc406f8061 Mon Sep 17 00:00:00 2001 From: axelm-tob Date: Thu, 12 Feb 2026 11:16:11 -0500 Subject: [PATCH 5/5] Explicit semgrep scan allow --- plugins/static-analysis/agents/semgrep-scanner.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/plugins/static-analysis/agents/semgrep-scanner.md b/plugins/static-analysis/agents/semgrep-scanner.md index 5a90912..4c97c43 100644 --- a/plugins/static-analysis/agents/semgrep-scanner.md +++ b/plugins/static-analysis/agents/semgrep-scanner.md @@ -1,7 +1,7 @@ --- name: semgrep-scanner description: "Executes semgrep CLI scans for a language category. Use when running automated static analysis scans with semgrep against a codebase." -tools: Bash +tools: Bash(semgrep scan:*), Bash --- # Semgrep Scanner Agent
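
Reviewer note: the scanner agent's rules (launch every ruleset with `&`, then `wait`, and never silently skip a failed ruleset) can be combined in a single function. The following is a minimal sketch, not part of the patch. `run_scans`, `SEMGREP_BIN`, `USE_PRO`, and the `.stderr` log files are hypothetical names introduced here; the semgrep flags are only those already listed in the agent's command pattern.

```bash
# Sketch only. run_scans, SEMGREP_BIN and USE_PRO are hypothetical names;
# the semgrep flags are the ones listed in the scanner agent's command pattern.
SEMGREP_BIN="${SEMGREP_BIN:-semgrep}"

# Usage: run_scans OUTPUT_DIR LANG TARGET RULESET...
# Launches one background scan per ruleset, then waits on each PID so that a
# failed ruleset is reported instead of being silently skipped.
# The function body is a subshell, so its variables stay local.
run_scans() (
  output_dir="$1"; lang="$2"; target="$3"
  shift 3
  mkdir -p "$output_dir" || exit 1

  jobs_list=""
  for ruleset in "$@"; do
    # Filename-safe label for the ruleset (slashes, colons, etc. become "_").
    name=$(printf '%s' "$ruleset" | tr -c 'A-Za-z0-9._-' '_')
    base="$output_dir/$lang-$name"
    # --pro is added only when USE_PRO is set to a non-empty value.
    "$SEMGREP_BIN" ${USE_PRO:+--pro} \
      --metrics=off \
      --config "$ruleset" \
      --json -o "$base.json" \
      --sarif-output="$base.sarif" \
      "$target" 2>"$base.stderr" &
    # Remember which PID belongs to which ruleset, one pair per line.
    jobs_list="$jobs_list$! $ruleset
"
  done

  # Wait on each PID individually to collect per-ruleset exit statuses.
  failed=0
  while read -r pid ruleset; do
    [ -n "$pid" ] || continue
    if ! wait "$pid"; then
      echo "FAILED: $ruleset (stderr captured next to the JSON output)"
      failed=$((failed + 1))
    fi
  done <<EOF
$jobs_list
EOF
  [ "$failed" -eq 0 ]
)
```

A bare `wait` with no arguments, as in the agent's pattern, returns success regardless of how the background jobs exited, which is why this sketch waits on each PID separately.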