diff --git a/README.md b/README.md index 9cf0e7b..e450a54 100644 --- a/README.md +++ b/README.md @@ -6,32 +6,47 @@ Semantic code analyzer that produces compressed TOON (Text Object-Oriented Notat ```bash cargo build --release +# Binaries land in target/release/ ``` ## Binaries -The project builds three binaries: +The project builds four binaries: | Binary | Purpose | |--------|---------| -| `semfora-engine` | CLI for semantic analysis, indexing, and querying | -| `semfora-engine-server` | MCP server for AI agent integration | -| `semfora-daemon` | WebSocket daemon for real-time updates | +| `semfora-engine` | Main CLI: analysis, indexing, querying, and MCP server | +| `semfora-daemon` | WebSocket daemon for real-time index updates | +| `semfora-benchmark-builder` | Benchmark tooling | +| `semfora-security-compiler` | Security pattern compiler | + +> **Note:** The MCP server is built into `semfora-engine` as the `serve` subcommand. +> There is no separate `semfora-engine-server` binary. ## Usage ```bash # Analyze a single file -semfora-engine path/to/file.rs +semfora-engine analyze path/to/file.rs + +# Analyze a directory +semfora-engine analyze ./src + +# Analyze uncommitted changes +semfora-engine analyze --uncommitted + +# Generate a semantic index for the current project +semfora-engine index generate . -# Analyze a directory and create sharded index -semfora-engine --dir . 
--shard +# Search for symbols by name +semfora-engine search "authenticate" # Query the index -semfora-engine --search-symbols "authenticate" +semfora-engine query overview +semfora-engine query symbol --help -# Start MCP server (for AI agents) -semfora-engine-server +# Start MCP server (for AI coding assistants) +semfora-engine serve --repo /path/to/project # Start WebSocket daemon (for real-time updates) semfora-daemon @@ -397,10 +412,12 @@ src/ | Document | Description | |----------|-------------| | [Quick Start](docs/quickstart.md) | Get up and running in 5 minutes | -| [CLI Reference](docs/cli.md) | Complete CLI usage, flags, and examples | +| [CLI Reference](docs/cli.md) | Complete CLI usage, subcommands, and examples | | [Features](docs/features.md) | Incremental indexing, layered indexes, risk assessment | +| [MCP Tools Reference](docs/mcp-tools-reference.md) | All MCP tools for AI agent integration | +| [MCP Workflows](docs/mcp-workflows.md) | Common MCP usage patterns | | [WebSocket Daemon](docs/websocket-daemon.md) | Real-time updates, protocol, and query methods | | [Adding Languages](docs/adding-languages.md) | Guide for adding new language support | -| [Engineering](docs/engineering.md) | Implementation details and status | +| [Architecture](docs/architecture.md) | Implementation details and design | ## License diff --git a/docs/cli.md b/docs/cli.md index a19aaf6..6f145e5 100644 --- a/docs/cli.md +++ b/docs/cli.md @@ -1,6 +1,6 @@ # Semfora CLI Reference -The `semfora-engine` CLI is a semantic code analyzer that produces compressed TOON output for AI-assisted code review. +The `semfora-engine` CLI is a semantic code analyzer that produces TOON output for AI-assisted code review. 
## Quick Start @@ -9,13 +9,16 @@ The `semfora-engine` CLI is a semantic code analyzer that produces compressed TO cargo build --release # Index a project -semfora-engine --dir /path/to/project --shard +semfora-engine index generate /path/to/project # Search for symbols -semfora-engine --search-symbols "authenticate" +semfora-engine search "authenticate" # Analyze uncommitted changes -semfora-engine --uncommitted +semfora-engine analyze --uncommitted + +# Start MCP server +semfora-engine serve --repo . ``` ## Installation @@ -25,191 +28,549 @@ cargo build --release # Binary: target/release/semfora-engine ``` -## Basic Usage +## Top-Level Usage + +``` +semfora-engine [OPTIONS] + +Commands: + analyze Analyze files, directories, or git changes [alias: a] + search Search for code (hybrid symbol + semantic) [alias: s] + query Query the semantic index [alias: q] + trace Trace symbol usage across the call graph + validate Run quality audits (complexity, duplicates) [alias: v] + index Manage the semantic index + cache Manage the cache + test Run or detect tests + lint Run linters [alias: l] + commit Prepare information for a commit message + setup Setup semfora-engine and MCP client configuration + uninstall Uninstall semfora-engine or MCP configurations + config Manage semfora-engine configuration + benchmark Run token efficiency benchmark + serve Start the MCP server (for AI coding assistants) + help Print help + +Global Options: + -f, --format Output format: text (default), toon, json + -v, --verbose Show verbose output + --progress Show progress percentage + -h, --help Print help + -V, --version Print version +``` + +--- + +## `analyze` — Analyze Code + +Analyze files, directories, or git changes. 
+ +``` +semfora-engine analyze [OPTIONS] [PATH] +``` + +### Arguments + +| Argument | Description | +|----------|-------------| +| `[PATH]` | Path to file or directory to analyze | + +### Options + +| Option | Description | +|--------|-------------| +| `--diff []` | Analyze git diff (auto-detects main/master if no ref given) | +| `--uncommitted` | Analyze uncommitted changes (working dir vs HEAD) | +| `--commit ` | Analyze a specific commit | +| `--all-commits` | Analyze all commits on current branch since base | +| `--base ` | Base branch for diff comparison | +| `--target-ref ` | Target ref (defaults to HEAD; use `WORKING` for uncommitted) | +| `--limit ` | Max files to show in diff output (pagination) | +| `--offset ` | Offset for diff pagination | +| `--max-depth ` | Max directory depth (default: 10) | +| `--ext ` | Filter by extension (repeatable: `--ext rs --ext ts`) | +| `--allow-tests` | Include test files (excluded by default) | +| `--summary-only` | Show summary statistics only | +| `--start-line ` | Start line for focused analysis (file mode only) | +| `--end-line ` | End line for focused analysis (file mode only) | +| `--output-mode ` | `full` (default), `symbols_only`, or `summary` | +| `--print-ast` | Print parsed AST (debugging) | +| `--analyze-tokens ` | Token analysis: `full` or `compact` | +| `--compare-compact` | Include compact JSON in token analysis | +| `--shard` | Generate sharded index (legacy flag, prefer `index generate`) | +| `--incremental` | Incremental indexing (legacy flag, prefer `index generate --incremental`) | + +### Examples ```bash -# Analyze a single file -semfora-engine path/to/file.rs +# Single file +semfora-engine analyze path/to/file.rs -# Analyze a directory (recursive) -semfora-engine --dir path/to/project +# Directory +semfora-engine analyze ./src -# Analyze uncommitted changes -semfora-engine --uncommitted +# Uncommitted changes +semfora-engine analyze --uncommitted + +# Diff against main +semfora-engine analyze --diff 
main + +# Diff with summary only +semfora-engine analyze --diff origin/main --summary-only + +# Specific commit +semfora-engine analyze --commit abc123 + +# Focused line range +semfora-engine analyze ./src/big_file.rs --start-line 100 --end-line 250 + +# JSON output +semfora-engine analyze path/to/file.rs --format json +``` + +--- + +## `search` — Search Code + +Hybrid search across symbol names and code semantics. + +``` +semfora-engine search [OPTIONS] +``` + +### Arguments + +| Argument | Description | +|----------|-------------| +| `` | Search query (required) | + +### Options + +| Option | Description | +|--------|-------------| +| `-s, --symbols` | Only show exact symbol name matches | +| `-r, --related` | Only show semantically related code | +| `--raw` | Raw regex search (for comments, strings, patterns) | +| `--kind ` | Filter by symbol kind (fn, struct, component, etc.) | +| `--module ` | Filter by module name | +| `--risk ` | Filter by risk level: high, medium, low | +| `--include-source` | Include source code snippets in output | +| `--limit ` | Max results (default: 20) | +| `--file-types ` | File types for raw search (e.g., `rs,ts,py`) | +| `--case-sensitive` | Case-sensitive search | +| `--symbol-scope ` | `functions` (default), `variables`, or `both` | +| `--include-escape-refs` | Include local variables that escape scope | + +### Examples + +```bash +# Hybrid search (default) +semfora-engine search "authenticate" + +# Symbol names only +semfora-engine search "handleRequest" --symbols + +# Semantic only +semfora-engine search "user login flow" --related + +# Filter by kind and risk +semfora-engine search "process" --kind fn --risk high + +# Search in a specific module +semfora-engine search "login" --module auth + +# Raw regex search +semfora-engine search "TODO|FIXME" --raw + +# Include variables in results +semfora-engine search "config" --symbol-scope both +``` + +--- -# Diff against main branch -semfora-engine --diff +## `query` — Query the 
Index + +Query the semantic index for symbols, source, callers, etc. + +``` +semfora-engine query ``` -## Operation Modes +### Subcommands -### Single File Analysis +#### `query overview` + +Get repository overview. ```bash -semfora-engine path/to/file.rs -semfora-engine path/to/file.ts --format json +semfora-engine query overview +semfora-engine query overview --modules # Include full module list +semfora-engine query overview --max-modules 50 # Limit modules shown ``` -### Directory Analysis +#### `query module ` + +Get details for a specific module. ```bash -# Analyze all files in a directory -semfora-engine --dir ./src +semfora-engine query module src.commands +semfora-engine query module auth --format json +``` + +#### `query symbol` + +Get a symbol by hash or file+line location. -# Limit recursion depth (default: 10) -semfora-engine --dir ./src --max-depth 5 +```bash +semfora-engine query symbol abc123def456 +semfora-engine query symbol --file-path ./src/main.rs --line 42 +``` -# Filter by file extension -semfora-engine --dir ./src --ext rs --ext ts +#### `query source` -# Include test files (excluded by default) -semfora-engine --dir ./src --allow-tests +Get source code for a file or symbol(s). -# Summary statistics only -semfora-engine --dir ./src --summary-only +```bash +semfora-engine query source abc123def456 +semfora-engine query source --file-path ./src/main.rs --start-line 10 --end-line 50 ``` -### Git Diff Analysis +#### `query callers ` + +Find what calls a symbol (reverse call graph). ```bash -# Diff against auto-detected base branch (main/master) -semfora-engine --diff +semfora-engine query callers abc123def456 +semfora-engine query callers abc123def456 --depth 3 +``` -# Diff against a specific branch -semfora-engine --diff develop +#### `query callgraph` -# Explicit base branch -semfora-engine --diff --base origin/main +Get the repository call graph. 
+ +```bash +semfora-engine query callgraph +semfora-engine query callgraph --format json +``` -# Analyze uncommitted changes (working directory vs HEAD) -semfora-engine --uncommitted +#### `query file ` -# Analyze a specific commit -semfora-engine --commit abc123 +List all symbols in a file. -# Analyze all commits since base branch -semfora-engine --commits +```bash +semfora-engine query file ./src/main.rs +semfora-engine query file ./src/commands/index.rs ``` -## Output Formats +#### `query languages` + +List all supported languages. ```bash -# TOON format (default) - token-efficient for AI consumption -semfora-engine file.rs --format toon +semfora-engine query languages +``` -# JSON format - standard structured output -semfora-engine file.rs --format json +--- -# Verbose output with AST info -semfora-engine file.rs --verbose +## `trace` — Trace Symbol Usage -# Print parsed AST (debugging) -semfora-engine file.rs --print-ast +Trace a symbol through the call graph in either direction. + +``` +semfora-engine trace [OPTIONS] ``` -## Sharded Indexing +### Arguments + +| Argument | Description | +|----------|-------------| +| `` | Symbol hash or name to trace | -For large repositories, create a sharded index for fast querying: +### Options -### Generate Index +| Option | Description | +|--------|-------------| +| `--kind ` | Target kind (function, variable, component, module, file, etc.) | +| `--direction ` | `incoming`, `outgoing`, or `both` (default: `both`) | +| `--depth ` | Max depth to traverse (default: 2) | +| `--limit ` | Max edges to return (default: 200) | +| `--offset ` | Pagination offset | +| `--include-escape-refs` | Include local variables that escape scope | +| `--include-external` | Include external nodes (ext:*) | +| `--path ` | Repository path | + +### Examples ```bash -# Generate sharded index (writes to ~/.cache/semfora-engine/) -semfora-engine --dir . 
--shard +# Trace all directions +semfora-engine trace abc123def456 + +# Incoming only (who calls this?) +semfora-engine trace abc123def456 --direction incoming + +# Outgoing only (what does this call?) +semfora-engine trace abc123def456 --direction outgoing --depth 3 + +# Trace by name +semfora-engine trace "authenticate" +``` + +--- + +## `validate` — Quality Audits + +Run complexity and duplicate code analysis. + +``` +semfora-engine validate [OPTIONS] [TARGET] +``` + +### Arguments + +| Argument | Description | +|----------|-------------| +| `[TARGET]` | File path, module name, or symbol hash (auto-detected) | + +### Options + +| Option | Description | +|--------|-------------| +| `--path ` | Repository path | +| `--symbol-hash ` | Symbol hash for single-symbol validation | +| `--file-path ` | File path for file-level validation | +| `--line ` | Line number (requires `--file-path`) | +| `--module ` | Module name for module-level validation | +| `--include-source` | Include source snippet in output | +| `--duplicates` | Find duplicate code patterns | +| `--threshold ` | Similarity threshold (default: 0.90) | +| `--include-boilerplate` | Include boilerplate in duplicate detection | +| `--kind ` | Filter by symbol kind | +| `--symbol-scope ` | `functions` (default), `variables`, or `both` | +| `--limit ` | Max clusters (default: 50) | +| `--offset ` | Pagination offset | +| `--min-lines ` | Min function lines to include (default: 3) | +| `--sort-by ` | Sort by: `similarity` (default), `size`, or `count` | + +### Examples + +```bash +# Validate a module (get module name from query overview first) +semfora-engine validate src.commands + +# Validate a specific file +semfora-engine validate --file-path ./src/main.rs + +# Find duplicates +semfora-engine validate --duplicates -# Incremental indexing (only re-index changed files) -semfora-engine --dir . 
--shard --incremental +# Lower threshold to find more similar code +semfora-engine validate --duplicates --threshold 0.75 -# Filter extensions during indexing -semfora-engine --dir . --shard --ext ts --ext tsx +# Validate a specific symbol +semfora-engine validate --symbol-hash abc123 ``` -### Query Index +--- + +## `index` — Manage the Index + +``` +semfora-engine index +``` + +### `index generate [PATH]` + +Generate or refresh the semantic index. ```bash -# Get repository overview -semfora-engine --get-overview +# Index current directory +semfora-engine index generate . -# List all modules in the index -semfora-engine --list-modules +# Index a specific path +semfora-engine index generate /path/to/project -# Get a specific module's symbols -semfora-engine --get-module api +# With progress output +semfora-engine index generate . --progress -# Search for symbols by name -semfora-engine --search-symbols "login" +# Force full re-index +semfora-engine index generate . --force -# List all symbols in a module -semfora-engine --list-symbols auth +# Incremental (only changed files) +semfora-engine index generate . --incremental -# Get a specific symbol by hash -semfora-engine --get-symbol abc123def456 +# Filter by extension +semfora-engine index generate . --ext rs --ext ts -# Get the call graph -semfora-engine --get-call-graph +# Limit depth +semfora-engine index generate . --max-depth 5 ``` -### Query Filtering +### `index check` + +Check if the index is fresh or stale. ```bash -# Filter by symbol kind -semfora-engine --search-symbols "handle" --kind fn +semfora-engine index check +``` -# Filter by risk level -semfora-engine --list-symbols api --risk high +### `index export` -# Limit results (default: 50) -semfora-engine --search-symbols "test" --limit 20 +Export the index to SQLite. 
+ +```bash +semfora-engine index export +semfora-engine index export --output ./my_index.db ``` -## Cache Management +--- + +## `cache` — Manage the Cache + +``` +semfora-engine cache +``` ```bash -# Show cache information -semfora-engine --cache-info +# Show cache info +semfora-engine cache info # Clear cache for current directory -semfora-engine --cache-clear +semfora-engine cache clear + +# Prune caches older than 30 days +semfora-engine cache prune 30 +``` + +--- + +## `serve` — Start the MCP Server -# Prune caches older than N days -semfora-engine --cache-prune 30 +Start the MCP server for AI coding assistants. Communicates via stdio. + +``` +semfora-engine serve [OPTIONS] ``` -## Static Analysis +### Options + +| Option | Description | +|--------|-------------| +| `-r, --repo ` | Repository path to serve (default: current directory) | +| `--no-watch` | Disable file watcher for live index updates | +| `--no-git-poll` | Disable git polling for branch/commit changes | + +### Examples ```bash -# Run static code analysis on the index -semfora-engine --analyze +# Serve current directory +semfora-engine serve + +# Serve a specific repo +semfora-engine serve --repo /path/to/project + +# Without file watching (useful for CI/testing) +semfora-engine serve --repo . --no-watch --no-git-poll +``` + +### MCP Client Configuration + +**Claude Desktop** (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS): + +```json +{ + "mcpServers": { + "semfora-engine": { + "type": "stdio", + "command": "/path/to/semfora-engine", + "args": ["serve", "--repo", "/path/to/your/project"] + } + } +} +``` + +**Cursor / VS Code / Other MCP clients:** + +```json +{ + "mcpServers": { + "semfora-engine": { + "command": "/path/to/semfora-engine", + "args": ["serve", "--repo", "/path/to/your/project"] + } + } +} +``` + +--- + +## `lint` — Run Linters + +Auto-detects and runs available linters for the project. 
+ +```bash +# Detect available linters without running +semfora-engine lint --detect-only + +# Run linters +semfora-engine lint + +# Run in fix mode +semfora-engine lint --mode fix --safe-only + +# Run a specific linter +semfora-engine lint --linter clippy -# Analyze a specific module only -semfora-engine --analyze --analyze-module api +# Typecheck only +semfora-engine lint --mode typecheck ``` -## Token Analysis +--- -Analyze token efficiency of TOON compression: +## `test` — Run Tests ```bash -# Full detailed report -semfora-engine file.rs --analyze-tokens full +# Run tests +semfora-engine test -# Compact single-line summary -semfora-engine file.rs --analyze-tokens compact +# Discover tests without running +semfora-engine test --discover-only -# Include compact JSON comparison -semfora-engine file.rs --analyze-tokens full --compare-compact +# Run tests in a specific path +semfora-engine test ./tests/integration ``` -## Benchmarking +--- + +## `commit` — Prepare Commit Context + +Gather semantic context for writing a commit message. + +```bash +semfora-engine commit +``` + +--- + +## Output Formats + +All commands support `--format`: + +| Format | Description | +|--------|-------------| +| `text` | Human-readable with visual formatting (default for terminal) | +| `toon` | TOON format — token-efficient for AI consumption | +| `json` | Standard JSON | ```bash -# Run token efficiency benchmark -semfora-engine --benchmark +semfora-engine query overview --format json +semfora-engine search "authenticate" --format toon ``` +--- + ## Test File Exclusion -By default, test files are excluded from analysis. Test patterns by language: +By default, test files are excluded. Use `--allow-tests` to include them. | Language | Excluded Patterns | |----------|-------------------| @@ -219,67 +580,70 @@ By default, test files are excluded from analysis. Test patterns by language: | Go | `*_test.go` | | Java | `*Test.java`, `*Tests.java` | -Use `--allow-tests` to include test files. 
+--- -## Directory for Index Queries +## Environment Variables -When using query commands (`--get-overview`, `--search-symbols`, etc.), the CLI uses the cache for the current working directory. The cache location is determined by the git remote URL hash for reproducibility. +| Variable | Description | +|----------|-------------| +| `RUST_LOG` | Logging verbosity (e.g., `RUST_LOG=semfora_engine=debug`) | -## Examples +--- -### Typical Workflow +## Exit Codes + +| Code | Meaning | +|------|---------| +| 0 | Success | +| 1 | File not found or IO error | +| 2 | Unsupported language | +| 3 | Parse failure | +| 4 | Semantic extraction or query error | +| 5 | Git error (not a git repo, etc.) | + +--- + +## Typical Workflows + +### Full Project Analysis ```bash -# 1. Generate index for a project +# 1. Generate index cd my-project -semfora-engine --dir . --shard +semfora-engine index generate . --progress -# 2. Get project overview -semfora-engine --get-overview +# 2. Get architecture overview +semfora-engine query overview # 3. Search for specific functionality -semfora-engine --search-symbols "authenticate" --kind fn +semfora-engine search "authenticate" --kind fn -# 4. Get details on a symbol -semfora-engine --get-symbol abc123def456 +# 4. Trace a symbol +semfora-engine trace --direction incoming -# 5. Analyze changes before commit -semfora-engine --uncommitted +# 5. Validate code quality +semfora-engine validate src.api -# 6. Analyze feature branch diff -semfora-engine --diff main +# 6. 
Check for duplicates +semfora-engine validate --duplicates ``` -### Code Review Workflow +### Code Review ```bash # Analyze PR changes -semfora-engine --diff origin/main +semfora-engine analyze --diff origin/main # Focus on specific file types -semfora-engine --diff origin/main --ext ts --ext tsx +semfora-engine analyze --diff origin/main --ext ts --ext tsx -# Get summary only -semfora-engine --diff origin/main --summary-only +# Summary only (fast overview) +semfora-engine analyze --diff origin/main --summary-only ``` -## Environment Variables - -- `RUST_LOG`: Control logging verbosity (e.g., `RUST_LOG=semfora_engine=debug`) - -## Exit Codes - -| Code | Meaning | -|------|---------| -| 0 | Success | -| 1 | File not found or IO error | -| 2 | Unsupported language | -| 3 | Parse failure | -| 4 | Semantic extraction or query error | -| 5 | Git error (not a git repo, etc.) | - ## See Also -- [Features](features.md) - Incremental indexing, layered indexes, risk assessment -- [WebSocket Daemon](websocket-daemon.md) - Real-time index updates via WebSocket -- [Main README](../README.md) - Supported languages and architecture +- [Quick Start](quickstart.md) — Get up and running in 5 minutes +- [Features](features.md) — Incremental indexing, layered indexes, risk assessment +- [MCP Tools Reference](mcp-tools-reference.md) — All 18 MCP tools with parameters +- [WebSocket Daemon](websocket-daemon.md) — Real-time updates via WebSocket diff --git a/docs/quickstart.md b/docs/quickstart.md index 4ebc11a..0733e70 100644 --- a/docs/quickstart.md +++ b/docs/quickstart.md @@ -2,11 +2,17 @@ Get up and running with Semfora Engine in 5 minutes. 
+## Prerequisites + +- [Rust toolchain](https://rustup.rs) (1.70+) +- A C compiler (`gcc`, `clang`, or [Zig](https://ziglang.org) as a drop-in) +- Git + ## Installation ```bash # Clone the repository -git clone https://github.com/Semfora-org/semfora-engine.git +git clone https://github.com/Semfora-AI/semfora-engine.git cd semfora-engine # Build release binaries @@ -16,90 +22,157 @@ cargo build --release export PATH="$PATH:$(pwd)/target/release" ``` -The build produces three binaries in `target/release/`: +The build produces these binaries in `target/release/`: | Binary | Purpose | |--------|---------| -| `semfora-engine` | CLI for semantic analysis, indexing, and querying | -| `semfora-engine-server` | MCP server for AI agent integration | -| `semfora-daemon` | WebSocket daemon for real-time updates | +| `semfora-engine` | Main CLI: analysis, indexing, querying, and MCP server | +| `semfora-daemon` | WebSocket daemon for real-time index updates | +| `semfora-benchmark-builder` | Benchmark tooling | +| `semfora-security-compiler` | Security pattern compiler | -## CLI Usage +> **Note:** There is no separate `semfora-engine-server` binary. The MCP server +> is built into `semfora-engine` via the `serve` subcommand. -### Step 1: Index a Repository +## Step 1: Index a Repository -Navigate to any git repository and create an index: +Navigate to any git repository and create a semantic index: ```bash cd /path/to/your/project -# Generate sharded index -semfora-engine --dir . --shard +# Generate index (writes to ~/.cache/semfora/) +semfora-engine index generate . + +# With progress output +semfora-engine index generate . --progress + +# Incremental: only re-index changed files +semfora-engine index generate . 
--incremental ``` -This creates a semantic index in `~/.cache/semfora/` with: +This creates a semantic index with: - Repository overview - Per-module symbol data - Call graph relationships -- Symbol lookup index +- BM25 full-text index for hybrid search -### Step 2: Search for Code - -Once indexed, you can search: +## Step 2: Search for Code ```bash -# Search for symbols by name -semfora-engine --search-symbols "authenticate" +# Hybrid search (symbol + semantic, default) +semfora-engine search "authenticate" + +# Symbol name matches only +semfora-engine search "handleRequest" --symbols + +# Semantic / related code only +semfora-engine search "user authentication" --related -# Filter by symbol type -semfora-engine --search-symbols "handle" --kind fn +# Filter by symbol kind +semfora-engine search "handle" --kind fn # Filter by risk level -semfora-engine --search-symbols "process" --risk high +semfora-engine search "process" --risk high -# List symbols in a specific module -semfora-engine --list-symbols api +# Filter by module +semfora-engine search "login" --module auth ``` -### Step 3: Get Repository Overview +## Step 3: Query the Index ```bash # High-level architecture summary -semfora-engine --get-overview +semfora-engine query overview -# List all modules -semfora-engine --list-modules +# Get a specific module's details +semfora-engine query module -# Get call graph -semfora-engine --get-call-graph +# Get a specific symbol by hash +semfora-engine query symbol + +# Get source code for a symbol +semfora-engine query source + +# Find callers of a symbol (reverse call graph) +semfora-engine query callers + +# Get the full call graph +semfora-engine query callgraph + +# List all symbols in a file +semfora-engine query file ./src/main.rs + +# List supported languages +semfora-engine query languages ``` -### Step 4: Analyze Changes +## Step 4: Analyze Code ```bash -# Analyze uncommitted changes -semfora-engine --uncommitted +# Analyze a single file +semfora-engine 
analyze path/to/file.rs + +# Analyze a directory +semfora-engine analyze ./src + +# Analyze uncommitted changes (working directory vs HEAD) +semfora-engine analyze --uncommitted # Diff against main branch -semfora-engine --diff main +semfora-engine analyze --diff main -# Analyze a specific file -semfora-engine path/to/file.rs +# Diff against a remote-tracking branch +semfora-engine analyze --diff origin/main + +# Analyze a specific commit +semfora-engine analyze --commit abc123 +``` + +## Step 5: Validate Code Quality + +```bash +# Validate a module (get module name from query overview first) +semfora-engine validate + +# Validate a specific file +semfora-engine validate --file-path ./src/main.rs + +# Find duplicate code across the codebase +semfora-engine validate --duplicates + +# Validate a specific symbol +semfora-engine validate --symbol-hash ``` ## MCP Server for AI Agents -### Setting Up with Claude Code +The MCP server runs as a subcommand of `semfora-engine` and communicates over stdio, a standard MCP transport. + +### Starting the Server + +```bash +# Serve current directory +semfora-engine serve + +# Serve a specific repository +semfora-engine serve --repo /path/to/project + +# Without file watching (useful for CI/testing) +semfora-engine serve --repo . --no-watch --no-git-poll +``` + +### Configuring with Claude Desktop -1. Add to your Claude Code MCP configuration (`~/.config/claude/claude_desktop_config.json`): +Add to your Claude Desktop MCP config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS): ```json { "mcpServers": { "semfora-engine": { "type": "stdio", - "command": "/path/to/semfora-engine/target/release/semfora-engine-server", - "args": ["--repo", "/path/to/your/project"], + "command": "/path/to/semfora-engine/target/release/semfora-engine", + "args": ["serve", "--repo", "/path/to/your/project"], "env": { "RUST_LOG": "semfora_engine=info" } @@ -108,111 +181,150 @@ semfora-engine path/to/file.rs } ``` -2. Restart Claude Code. 
The AI will now have access to semantic code analysis tools. +### Configuring with Other MCP Clients (VS Code, Cursor, etc.) -### MCP Server Options - -```bash -# Start server for current directory -semfora-engine-server - -# Start server for a specific repository -semfora-engine-server --repo /path/to/project +```json +{ + "mcpServers": { + "semfora-engine": { + "command": "/path/to/semfora-engine/target/release/semfora-engine", + "args": ["serve", "--repo", "/path/to/your/project"] + } + } +} ``` ### Available MCP Tools -Once connected, the AI has access to: +Once connected, the AI has access to 18 tools: | Tool | Description | |------|-------------| -| `generate_index` | Create/update semantic index | -| `get_repo_overview` | Get architecture summary | -| `search_symbols` | Find symbols by name | -| `get_symbol` | Get detailed symbol info | -| `get_symbol_source` | Get source code for a symbol | -| `analyze_diff` | Analyze git changes | -| `run_tests` | Run project tests | +| `get_context` | Git context and index status (~200 tokens; use first) | +| `get_overview` | Repository architecture overview | +| `search` | Hybrid symbol + semantic search | +| `analyze` | Semantic analysis of file, directory, or module | +| `analyze_diff` | Analyze git changes / PR diffs | +| `get_file` | List symbols in a file | +| `get_symbol` | Detailed symbol information | +| `get_source` | Source code for symbol(s) or line range | +| `get_callers` | Reverse call graph (impact analysis) | +| `get_callgraph` | Full call graph | +| `validate` | Quality audit (complexity, duplicates) | +| `find_duplicates` | Duplicate code detection | +| `index` | Refresh/check the semantic index | +| `test` | Run or discover tests | +| `lint` | Run linters (auto-detects available tools) | +| `server_status` | Server status and layer info | +| `prep_commit` | Prepare commit message context | +| `get_languages` | List supported languages | + +See [MCP Tools Reference](mcp-tools-reference.md) for full 
parameter documentation. + +## Cache Management -## WebSocket Daemon (Advanced) +```bash +# Show cache info +semfora-engine cache info -For real-time index updates and multi-client support: +# Clear cache for current directory +semfora-engine cache clear + +# Prune old caches (older than 30 days) +semfora-engine cache prune 30 +``` + +## Output Formats + +All commands support `--format`: ```bash -# Start the daemon -semfora-daemon --port 9847 +# Default: human-readable text +semfora-engine query overview -# Connect via WebSocket client and send: -# {"type": "connect", "directory": "/path/to/project"} -``` +# TOON format (token-efficient for AI consumption) +semfora-engine query overview --format toon -See [WebSocket Daemon](websocket-daemon.md) for full protocol documentation. +# JSON format +semfora-engine query overview --format json +``` ## Common Workflows ### Code Review ```bash -# 1. Index the repository -semfora-engine --dir . --shard +# 1. Index the repository (if not already done) +semfora-engine index generate . # 2. Analyze the PR diff -semfora-engine --diff origin/main +semfora-engine analyze --diff origin/main # 3. Find high-risk changes -semfora-engine --search-symbols "*" --risk high +semfora-engine search "process" --risk high ``` ### Codebase Exploration ```bash # 1. Get overview -semfora-engine --get-overview +semfora-engine query overview -# 2. List modules -semfora-engine --list-modules +# 2. Explore a specific module +semfora-engine query module src.api -# 3. Explore a specific module -semfora-engine --list-symbols components +# 3. Search for functionality +semfora-engine search "authentication" # 4. Get details on a symbol -semfora-engine --get-symbol +semfora-engine query symbol + +# 5. Find what calls it +semfora-engine query callers ``` -### Incremental Updates +### Tracing Symbol Usage ```bash -# Initial full index -semfora-engine --dir . 
--shard +# Trace a symbol through the call graph (incoming + outgoing) +semfora-engine trace + +# Only incoming calls +semfora-engine trace --direction incoming -# Later: incremental update (only changed files) -semfora-engine --dir . --shard --incremental +# Only outgoing calls, 3 levels deep +semfora-engine trace --direction outgoing --depth 3 ``` ## Troubleshooting ### "No index found" -Run `semfora-engine --dir . --shard` to create an index first. +```bash +semfora-engine index generate . +``` -### Stale index +### Index is stale -Run `semfora-engine --dir . --shard --incremental` to update. +```bash +semfora-engine index generate . --incremental +``` -### View cache info +### Check index freshness ```bash -semfora-engine --cache-info +semfora-engine index check ``` -### Clear cache +### View cache info ```bash -semfora-engine --cache-clear +semfora-engine cache info ``` ## Next Steps -- [CLI Reference](cli.md) - Full command documentation -- [Features](features.md) - Incremental indexing, layered indexes, risk assessment -- [Adding Languages](adding-languages.md) - Extend language support +- [CLI Reference](cli.md) — Full command and option documentation +- [MCP Tools Reference](mcp-tools-reference.md) — All 18 MCP tools with parameters +- [Features](features.md) — Incremental indexing, layered indexes, risk assessment +- [Adding Languages](adding-languages.md) — Extend language support