
A DSPy-powered script for automating first-pass security analysis of codebases, API specs, and more. It uses DSPy to create gists (condensed summaries) so it can process files and codebases that would otherwise greatly exceed an LLM's context window.


iaji/AI-AppSec-Analysis


Multi-Expert Recursive Application Security Analysis POC

THIS IS AN AI-GENERATED README; SOME THINGS MAY BE WRONG.

This is a Proof-of-Concept (POC) Python script for performing automated security analysis on application code, configurations, API specifications, documentation, and other text-based files. It leverages DSPy (a framework for programming with language models) and AI-driven "expert" modules to recursively chunk and analyze content, identify potential security vulnerabilities, and generate consolidated reports.

The tool breaks down large files into manageable chunks, applies specialized AI agents (e.g., for code, APIs, configs) in a recursive manner, and synthesizes findings into readable Markdown reports and detailed JSON outputs. It's designed for static application security testing (SAST) with a focus on OWASP Top 10, API security, misconfigurations, and more.

Note: This is a POC and relies on the quality of the underlying language model (LLM). Results should be verified by security experts. It does not replace professional security audits.

Features

  • Recursive Analysis: Handles large files by chunking and subdividing content recursively (with configurable depth, chunk sizes, and subdivision factors).
  • Multi-Expert Modules: Uses DSPy signatures for specialized analysis:
    • General Overview: High-level content assessment.
    • Code Module: Scans source code for vulnerabilities (e.g., injection flaws, insecure data handling).
    • API Module: Analyzes API specs (e.g., OpenAPI, Postman) against OWASP API Security Top 10.
    • Configuration Module: Checks configs for misconfigurations (e.g., default credentials).
    • Documentation Module: Reviews docs for security gaps.
    • Threat Modeler: Identifies potential threats and attack vectors.
    • Compliance Checker: Checks against standards like OWASP and data protection principles.
  • Directory Support: Processes entire directories, ignoring specified patterns (e.g., logs, caches).
  • Consolidated Reports: Aggregates findings across files/chunks into a single Markdown report with optional per-file details appended.
  • Output Formats: Markdown for human-readable reports; JSON for detailed, structured results.
  • Content Detection: Automatically detects file types (e.g., Python code, JSON configs, API specs).
  • Customization: Configurable chunking, recursion limits, LLM settings, and more via command-line arguments.
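The chunking behavior described above can be sketched roughly as follows (a minimal illustration of overlapping fixed-size chunks, not the script's actual implementation; the name chunk_text is hypothetical):

```python
def chunk_text(text, chunk_size=12000, overlap=500):
    """Split text into fixed-size chunks that overlap by `overlap` chars,
    so context spanning a chunk boundary is not lost."""
    if len(text) <= chunk_size:
        return [text]
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
        # Step forward by chunk_size minus overlap to create the overlap.
        start += chunk_size - overlap
    return chunks
```

The defaults mirror the command-line defaults documented below (12000-character chunks with 500 characters of overlap).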

Requirements

  • Python 3.8+ (tested on 3.12.3)
  • DSPy: For defining and running AI modules.
  • LiteLLM: For LLM API interactions (supports OpenAI-compatible models).
  • Standard-library modules (no extra install needed): argparse, json, os, fnmatch, collections, typing.

Install dependencies:

pip install dspy-ai litellm

You need access to an LLM API (e.g., OpenAI, or any OpenAI-compatible endpoint). Set your API key via environment variable or command-line argument.

Installation

  1. Clone the repository:

     git clone https://github.com/iaji/AI-AppSec-Analysis.git
  2. Install dependencies:

    pip install -r requirements.txt

    (Create a requirements.txt with dspy-ai and litellm if needed.)

  3. Ensure your LLM API is accessible (e.g., set API_KEY environment variable).

Usage

Run the script with Python:

python appsec_analysis.py <input_path> [options]

  • <input_path>: Path to a file or directory to analyze.

Key Command-Line Arguments

  • --model: LLM model name (default: empty; set to e.g., gpt-4o).
  • --api_base_url: LLM API base URL (default: empty; e.g., https://api.openai.com/v1).
  • --api_key: LLM API key (or use API_KEY env var).
  • --initial_chunk_size: Size of initial chunks (default: 12000 chars).
  • --initial_chunk_overlap: Overlap between chunks (default: 500 chars).
  • --max_chars_no_initial_chunk: Max file size before chunking (default: 600000 chars).
  • --rec_min_chunks_subdivide: Min chunks to subdivide recursively (default: 4).
  • --rec_max_depth: Max recursion depth (default: 3).
  • --rec_max_chunks_leaf: Max chunks per leaf analysis (default: 5).
  • --rec_subdivision_factor: Subdivision factor for recursion (default: 2).
  • --max_output_tokens: Max tokens for LLM responses (default: 60000).
  • --max_consolidator_input_chars: Max chars for findings consolidator input (default: 750000).
  • --output_report_file: Markdown report filename (default: security_analysis_report.md).
  • --output_json_file: JSON results filename (default: security_analysis_details.json).
  • --output_dir: Directory for outputs (default: current dir).
  • --ignore_paths: Patterns to ignore (e.g., *.log __pycache__ temp/*).
  • --append_detailed_reports: Append per-file reports to directory consolidated report.
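An argument parser along these lines would produce the options above (a hypothetical reconstruction showing a few of the flags with the documented defaults, not the script's verbatim code):

```python
import argparse
import os

def build_parser():
    """Build a parser mirroring a subset of the documented CLI options."""
    p = argparse.ArgumentParser(description="AppSec analysis POC")
    p.add_argument("input_path", help="File or directory to analyze")
    p.add_argument("--model", default="")
    p.add_argument("--api_base_url", default="")
    # Fall back to the API_KEY environment variable, as documented.
    p.add_argument("--api_key", default=os.environ.get("API_KEY", ""))
    p.add_argument("--initial_chunk_size", type=int, default=12000)
    p.add_argument("--initial_chunk_overlap", type=int, default=500)
    p.add_argument("--rec_max_depth", type=int, default=3)
    p.add_argument("--ignore_paths", nargs="*", default=[])
    return p
```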

Examples

  1. Analyze a single file:

    python appsec_analysis.py path/to/example.py --model gpt-4o --api_key your-api-key

    Outputs: security_analysis_report.md and security_analysis_details.json.

  2. Analyze a directory, ignoring certain patterns:

    python appsec_analysis.py path/to/project_dir --model gpt-4o --api_key your-api-key --ignore_paths "*.log" "__pycache__" --append_detailed_reports

    Generates a consolidated report for the directory, with optional per-file details appended.

How It Works

  1. File Reading & Chunking: Reads files, detects content type, and chunks large content.
  2. Recursive Analysis: Subdivides chunks into sections, applies expert modules at leaf nodes.
  3. Expert Modules: Each module (e.g., CodeAnalysisSignature) uses DSPy to generate findings via LLM prompts.
  4. Consolidation: Aggregates raw findings, de-duplicates, and synthesizes into cohesive reports.
  5. Outputs: Markdown for summaries; JSON for full analysis trees and raw data.
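The recursive pass in steps 1–2 might look like this sketch (an assumed structure built from the configuration options above; analyze_leaf is a hypothetical stand-in for the expert-module LLM calls):

```python
def analyze_leaf(chunk):
    """Stand-in for the expert modules that would prompt the LLM."""
    return {"chunk": chunk[:40], "findings": []}

def analyze(chunks, depth=0, max_depth=3, min_chunks=4,
            max_leaf_chunks=5, factor=2):
    """Recursively subdivide a chunk list until it is small enough
    to hand to the expert modules, then gather raw findings."""
    if depth >= max_depth or len(chunks) < min_chunks:
        # Leaf node: analyze at most max_leaf_chunks chunks directly.
        return [analyze_leaf(c) for c in chunks[:max_leaf_chunks]]
    findings = []
    # Split the chunk list into `factor` sections and recurse on each.
    size = max(1, len(chunks) // factor)
    for i in range(0, len(chunks), size):
        findings.extend(analyze(chunks[i:i + size], depth + 1,
                                max_depth, min_chunks,
                                max_leaf_chunks, factor))
    return findings
```

The defaults here correspond to --rec_max_depth, --rec_min_chunks_subdivide, --rec_max_chunks_leaf, and --rec_subdivision_factor; step 4's consolidation would then de-duplicate and synthesize the returned findings.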

Limitations

  • POC Nature: Outputs depend on LLM accuracy; may produce false positives/negatives.
  • No Execution: Static analysis only; no runtime testing.
  • LLM Dependency: Requires a powerful LLM for best results (e.g., GPT-4 or equivalent).
  • Performance: Large directories or deep recursion may be slow/expensive due to LLM calls.
  • Encoding Issues: Handles UTF-8 and Latin-1; may fail on other encodings.

Contributing

Contributions are welcome! Open issues for bugs or features, or submit pull requests.

  1. Fork the repo.
  2. Create a branch: git checkout -b feature-branch.
  3. Commit changes: git commit -m "Add feature".
  4. Push: git push origin feature-branch.
  5. Open a PR.

License

This project is licensed under the MIT License. See LICENSE for details.
