Merged
2 changes: 1 addition & 1 deletion .claude-plugin/marketplace.json
@@ -188,7 +188,7 @@
},
{
"name": "static-analysis",
- "version": "1.0.3",
+ "version": "1.1.0",
"description": "Static analysis toolkit with CodeQL, Semgrep, and SARIF parsing for security vulnerability detection",
"author": {
"name": "Axel Mierczuk & Paweł Płatek"
2 changes: 1 addition & 1 deletion plugins/static-analysis/.claude-plugin/plugin.json
@@ -1,6 +1,6 @@
{
"name": "static-analysis",
- "version": "1.0.3",
+ "version": "1.1.0",
"description": "Static analysis toolkit with CodeQL, Semgrep, and SARIF parsing for security vulnerability detection",
"author": {
"name": "Axel Mierczuk & Paweł Płatek"
7 changes: 7 additions & 0 deletions plugins/static-analysis/README.md
@@ -47,6 +47,13 @@ Use this plugin when you need to:
- Aggregate and deduplicate results from multiple files
- CI/CD integration patterns

## Agents Included

| Agent | Tools | Purpose |
|--------------------|------------------------|----------------------------------------------------------------|
| `semgrep-scanner` | Bash | Executes parallel semgrep scans for a language category |
| `semgrep-triager` | Read, Grep, Glob, Write | Classifies findings as true/false positives by reading source |

## Installation

```
71 changes: 71 additions & 0 deletions plugins/static-analysis/agents/semgrep-scanner.md
@@ -0,0 +1,71 @@
---
name: semgrep-scanner
description: "Executes semgrep CLI scans for a language category. Use when running automated static analysis scans with semgrep against a codebase."
tools: Bash(semgrep scan:*), Bash
---

# Semgrep Scanner Agent

You are a Semgrep scanner agent responsible for executing
static analysis scans for a specific language category.

## Core Rules

1. **Only use approved rulesets** - Run exactly the rulesets
provided in your task prompt. Never add or remove rulesets.
2. **Always use `--metrics=off`** - Prevents sending telemetry
to Semgrep servers. No exceptions.
3. **Use `--pro` when available** - If the task indicates Pro
engine is available, always include the `--pro` flag for
cross-file taint tracking.
4. **Parallel execution** - Run all rulesets simultaneously
using `&` and `wait`. Never run rulesets sequentially.

## Scan Command Pattern

For each approved ruleset, generate and run:

```bash
semgrep [--pro if available] \
  --metrics=off \
  --config [RULESET] \
  --json -o [OUTPUT_DIR]/[lang]-[ruleset-name].json \
  --sarif-output=[OUTPUT_DIR]/[lang]-[ruleset-name].sarif \
  [TARGET] &
```

After launching all rulesets:

```bash
wait
```
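
The placeholder expansion in the pattern above can be sketched in Python. This is only an illustration: `build_scan_command` and `ruleset_slug` are hypothetical helpers, and the way a ruleset name is turned into a file name is an assumption, not something this agent defines. The agent itself still launches the commands from Bash with `&` and `wait`.

```python
import re
from pathlib import Path


def ruleset_slug(ruleset):
    """Make a ruleset identifier safe for file names (assumed convention)."""
    return re.sub(r"[^A-Za-z0-9]+", "-", ruleset).strip("-").lower()


def build_scan_command(ruleset, lang, output_dir, target, pro=False):
    """Expand the scan command pattern into an argv list for one ruleset."""
    base = Path(output_dir) / f"{lang}-{ruleset_slug(ruleset)}"
    cmd = ["semgrep"]
    if pro:
        # Only added when the task says the Pro engine is available.
        cmd.append("--pro")
    cmd += [
        "--metrics=off",  # Core Rule 2: never omitted
        "--config", ruleset,
        "--json", "-o", f"{base}.json",
        f"--sarif-output={base}.sarif",
        target,
    ]
    return cmd
```

Calling `build_scan_command("p/python", "python", "out", "src", pro=True)` yields one argv list per ruleset, so every command carries the same mandatory flags.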

## GitHub URL Rulesets

For rulesets specified as GitHub URLs (e.g.,
`https://github.com/trailofbits/semgrep-rules`):
- Clone the repository first if not already cached locally
- Use the local path as the `--config` value, or pass the
URL directly to semgrep (it handles GitHub URLs natively)

## Output Requirements

After all scans complete, report:
- Number of findings per ruleset
- Any scan errors or warnings
- File paths of all generated JSON and SARIF results
- If Pro was used, note any cross-file findings detected
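
The per-ruleset counts in this report could be gathered along these lines. A minimal sketch, assuming each result file is Semgrep JSON output with top-level `results` and `errors` arrays; `count_findings` is a hypothetical helper name, and skipping `*-triage.json` files is an assumption based on the triage agent writing into the same directory.

```python
import json
from pathlib import Path


def count_findings(output_dir):
    """Return {file name: {"findings": n, "errors": m}} for each scan result."""
    summary = {}
    for path in sorted(Path(output_dir).glob("*.json")):
        if path.name.endswith("-triage.json"):
            continue  # triage output, not a scan result
        data = json.loads(path.read_text(encoding="utf-8"))
        summary[path.name] = {
            "findings": len(data.get("results", [])),
            "errors": len(data.get("errors", [])),
        }
    return summary
```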

## Error Handling

- If a ruleset fails to download, report the error but
continue with remaining rulesets
- If semgrep exits non-zero for a scan, capture stderr and
include it in the report
- Never silently skip a failed ruleset
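
The "report every failure, keep going" behaviour can be illustrated with a short Python sketch. Note that this swaps in `subprocess.Popen` for the Bash `&` and `wait` that the agent actually uses; `run_all` and the shape of its report are hypothetical.

```python
import subprocess


def run_all(commands):
    """Start every command at once, then wait for all of them.

    commands maps a ruleset name to an argv list. One report entry is
    returned per ruleset, so a failed scan is never dropped silently.
    """
    procs = [
        (name, subprocess.Popen(cmd, stdout=subprocess.DEVNULL,
                                stderr=subprocess.PIPE, text=True))
        for name, cmd in commands.items()
    ]
    reports = []
    for name, proc in procs:
        _, stderr = proc.communicate()  # blocks until this scan exits
        reports.append({
            "ruleset": name,
            "exit_code": proc.returncode,
            "stderr": stderr.strip(),
        })
    return reports
```

Entries with a non-zero `exit_code` are the ones whose `stderr` belongs in the final report.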

## Full Reference

For the complete scanner task prompt template with variable
substitutions and examples, see:
`{baseDir}/skills/semgrep/references/scanner-task-prompt.md`
107 changes: 107 additions & 0 deletions plugins/static-analysis/agents/semgrep-triager.md
@@ -0,0 +1,107 @@
---
name: semgrep-triager
description: "Classifies semgrep scan findings as true or false positives by reading source context. Use when triaging static analysis results to separate real vulnerabilities from noise."
tools: Read, Grep, Glob, Write
---

# Semgrep Triage Agent

You are a security finding triager responsible for classifying
Semgrep scan results as true or false positives by reading
source code context.

## Task

For each finding in the provided JSON result files:

1. Read the JSON finding (rule ID, file, line number)
2. Read source code context (at least 5 lines before/after)
3. Classify as `TRUE_POSITIVE` or `FALSE_POSITIVE`
4. Write a brief reason for the classification

## Decision Tree

Apply these checks in order. The first match determines
the classification:

```
Finding
  |-- In a test file?
  |     -> FALSE_POSITIVE (note: add to .semgrepignore)
  |-- In example/documentation code?
  |     -> FALSE_POSITIVE
  |-- Has nosemgrep comment?
  |     -> FALSE_POSITIVE (already acknowledged)
  |-- Input sanitized/validated upstream?
  |     Check 10-20 lines before for validation
  |     -> FALSE_POSITIVE if validated
  |-- Code path reachable?
  |     Check if function is called/exported
  |     -> FALSE_POSITIVE if dead code
  |-- None of the above
        -> TRUE_POSITIVE
```
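
The first-match ordering of the tree can be written down as a small Python sketch. The `Evidence` fields and the `classify` function are hypothetical names; in practice each flag is the result of reading the source, which this sketch takes as given.

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """What the triager learned about one finding by reading the code."""
    in_test_file: bool = False
    in_example_or_docs: bool = False
    has_nosemgrep: bool = False
    validated_upstream: bool = False
    reachable: bool = True


def classify(ev):
    """Apply the checks in tree order; the first match decides."""
    checks = [
        (ev.in_test_file, "Test file (candidate for .semgrepignore)"),
        (ev.in_example_or_docs, "Example or documentation code"),
        (ev.has_nosemgrep, "Already acknowledged with nosemgrep"),
        (ev.validated_upstream, "Input validated upstream"),
        (not ev.reachable, "Dead code"),
    ]
    for matched, reason in checks:
        if matched:
            return "FALSE_POSITIVE", reason
    return "TRUE_POSITIVE", "No false-positive check matched"
```

Because the defaults describe a reachable, unvalidated finding in production code, `classify(Evidence())` falls through to `TRUE_POSITIVE`, which matches the rule of treating uncertain findings as real.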

## Classification Guidelines

**TRUE_POSITIVE indicators:**
- User input flows to sensitive sink without sanitization
- Hardcoded credentials or API keys in source (not test) code
- Known-vulnerable function usage in production paths
- Missing security controls (no CSRF, no auth check)

**FALSE_POSITIVE indicators:**
- Test files with mock/fixture data
- Input is validated before reaching the flagged line
- Code is behind a feature flag or compile-time guard
- Dead code (unreachable function, commented-out caller)
- Documentation or example snippets

## Output Format

Write a triage file to `[OUTPUT_DIR]/[lang]-triage.json`:

```json
{
  "file": "[lang]-[ruleset].json",
  "total": 45,
  "true_positives": [
    {
      "rule": "rule.id.here",
      "file": "path/to/file.py",
      "line": 42,
      "reason": "User input in raw SQL without parameterization"
    }
  ],
  "false_positives": [
    {
      "rule": "rule.id.here",
      "file": "tests/test_file.py",
      "line": 15,
      "reason": "Test file with mock data"
    }
  ]
}
```
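
One way to assemble a file in this shape is sketched below. `build_triage` and `write_triage` are hypothetical helpers, and the per-finding `label` key is an assumption used only to route each entry into the right list; it is not part of the output format.

```python
import json
from pathlib import Path


def build_triage(source_file, classified):
    """Group classified findings into the triage document.

    classified is a list of dicts with rule, file, line, reason and a
    label of TRUE_POSITIVE or FALSE_POSITIVE.
    """
    def entry(item):
        return {k: item[k] for k in ("rule", "file", "line", "reason")}

    return {
        "file": source_file,
        "total": len(classified),
        "true_positives": [entry(i) for i in classified
                           if i["label"] == "TRUE_POSITIVE"],
        "false_positives": [entry(i) for i in classified
                            if i["label"] == "FALSE_POSITIVE"],
    }


def write_triage(triage, output_dir, lang):
    """Write the document to [OUTPUT_DIR]/[lang]-triage.json."""
    path = Path(output_dir) / f"{lang}-triage.json"
    path.write_text(json.dumps(triage, indent=2), encoding="utf-8")
    return path
```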

## Report

After triage, provide a summary:
- Total findings examined
- True positives count
- False positives count with breakdown by reason category
(test files, sanitized inputs, dead code, etc.)
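
The summary numbers can be derived directly from the triage document. A minimal sketch, assuming the structure shown in Output Format; `summarize` is a hypothetical helper, and it groups by the exact `reason` string, so free-text reasons would need normalizing into categories first.

```python
from collections import Counter


def summarize(triage):
    """Count true/false positives and break false positives down by reason."""
    false_positives = triage["false_positives"]
    return {
        "total": triage["total"],
        "true_positives": len(triage["true_positives"]),
        "false_positives": len(false_positives),
        "false_positive_reasons": dict(
            Counter(fp["reason"] for fp in false_positives)
        ),
    }
```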

## Important

- Read actual source code for every finding. Never classify
based solely on the rule name or file path.
- When uncertain, classify as TRUE_POSITIVE. False negatives
are worse than false positives in security triage.
- Process all input JSON files for the language category.

## Full Reference

For the complete triage task prompt template with variable
substitutions and examples, see:
`{baseDir}/skills/semgrep/references/triage-task-prompt.md`
18 changes: 16 additions & 2 deletions plugins/static-analysis/skills/semgrep/SKILL.md
@@ -241,6 +241,8 @@ Present plan to user with **explicit ruleset listing**:
- Spawn 3 parallel scan Tasks (Python, JavaScript, Docker)
- Total rulesets: 9
- [If Pro] Cross-file taint tracking enabled
- Scan agent: `static-analysis:semgrep-scanner`
- Triage agent: `static-analysis:semgrep-triager`

**Want to modify rulesets?** Tell me which to add or remove.
**Ready to scan?** Say "proceed" or "yes".
@@ -282,6 +284,7 @@ Before marking Step 3 complete, verify:
- [ ] User given opportunity to modify rulesets
- [ ] User explicitly approved (quote their confirmation)
- [ ] **Final ruleset list captured for Step 4**
- [ ] Agent types listed: `static-analysis:semgrep-scanner` and `static-analysis:semgrep-triager`

### Step 4: Spawn Parallel Scan Tasks

@@ -296,7 +299,7 @@ mkdir -p "$OUTPUT_DIR"
echo "Output directory: $OUTPUT_DIR"
```

- **Spawn N Tasks in a SINGLE message** (one per language category) using `subagent_type: Bash`.
+ **Spawn N Tasks in a SINGLE message** (one per language category) using `subagent_type: static-analysis:semgrep-scanner`.

Use the scanner task prompt template from [scanner-task-prompt.md]({baseDir}/references/scanner-task-prompt.md).

@@ -318,7 +321,7 @@ Spawn these 3 Tasks in a SINGLE message:

### Step 5: Spawn Parallel Triage Tasks

- After scan Tasks complete, spawn triage Tasks using `subagent_type: general-purpose` (triage requires reading code context, not just running commands).
+ After scan Tasks complete, spawn triage Tasks using `subagent_type: static-analysis:semgrep-triager` (triage requires reading code context, not just running commands).

Use the triage task prompt template from [triage-task-prompt.md]({baseDir}/references/triage-task-prompt.md).

@@ -396,6 +399,17 @@ Results written to:
3. Triage requires reading code context - parallelized via Tasks
4. Some false positive patterns require human judgment

## Agents

This plugin provides two specialized agents for the scan and triage phases:

| Agent | Tools | Purpose |
|-------|-------|---------|
| `static-analysis:semgrep-scanner` | Bash | Executes parallel semgrep scans for a language category |
| `static-analysis:semgrep-triager` | Read, Grep, Glob, Write | Classifies findings as true/false positives by reading source context |

Use `subagent_type: static-analysis:semgrep-scanner` in Step 4 and `subagent_type: static-analysis:semgrep-triager` in Step 5 when spawning Task subagents.

## Rationalizations to Reject

| Shortcut | Why It's Wrong |
@@ -1,6 +1,6 @@
# Scanner Subagent Task Prompt

- Use this prompt template when spawning scanner Tasks in Step 4. Use `subagent_type: Bash`.
+ Use this prompt template when spawning scanner Tasks in Step 4. Use `subagent_type: static-analysis:semgrep-scanner`.

## Template

@@ -1,6 +1,6 @@
# Triage Subagent Task Prompt

- Use this prompt template when spawning triage Tasks in Step 5. Use `subagent_type: general-purpose`.
+ Use this prompt template when spawning triage Tasks in Step 5. Use `subagent_type: static-analysis:semgrep-triager`.

## Template
