Semantic code analyzer that produces compressed TOON (Text Object-Oriented Notation) output for AI-assisted code review. Extracts symbols, dependencies, control flow, state changes, and risk assessments from source files.
→ Quick Start — build, index, and connect your AI agent in under 5 minutes.
cargo build --release
# Binaries land in target/release/

The project builds four binaries:

| Binary | Purpose |
|---|---|
| `semfora-engine` | Main CLI: analysis, indexing, querying, and MCP server |
| `semfora-daemon` | WebSocket daemon for real-time index updates |
| `semfora-benchmark-builder` | Benchmark tooling |
| `semfora-security-compiler` | Security pattern compiler |
Note: The MCP server is built into `semfora-engine` as the `serve` subcommand. There is no separate `semfora-engine-server` binary.
# Analyze a single file
semfora-engine analyze path/to/file.rs
# Analyze a directory
semfora-engine analyze ./src
# Analyze uncommitted changes
semfora-engine analyze --uncommitted
# Generate a semantic index for the current project
semfora-engine index generate .
# Search for symbols by name
semfora-engine search "authenticate"
# Query the index
semfora-engine query overview
semfora-engine query symbol --help
# Start MCP server (for AI coding assistants)
semfora-engine serve --repo /path/to/project
# Start WebSocket daemon (for real-time updates)
semfora-daemon

See CLI Reference for full documentation.
| Language | Extensions | Family | Implementation Details |
|---|---|---|---|
| TypeScript | `.ts`, `.mts`, `.cts` | JavaScript | Full AST extraction via tree-sitter-typescript; exports, interfaces, enums, decorators |
| TSX | `.tsx` | JavaScript | TypeScript + JSX/React component detection, hooks, styled-components |
| JavaScript | `.js`, `.mjs`, `.cjs` | JavaScript | Functions, classes, imports; framework detection for React, Express, Angular |
| JSX | `.jsx` | JavaScript | JavaScript + JSX component detection |
| Rust | `.rs` | Rust | Functions, structs, traits, enums; pub visibility detection via tree-sitter-rust |
| Python | `.py`, `.pyi` | Python | Functions, classes, decorators; underscore-prefix privacy convention |
| Go | `.go` | Go | Functions, methods, structs; uppercase-export convention via tree-sitter-go |
| Java | `.java` | Java | Classes, interfaces, enums, methods; visibility modifiers |
| Kotlin | `.kt`, `.kts` | Kotlin | Classes, functions, objects; visibility modifiers via tree-sitter-kotlin-ng |
| C | `.c`, `.h` | C Family | Functions, structs, enums; macro and extern detection via tree-sitter-c |
| C++ | `.cpp`, `.cc`, `.cxx`, `.hpp`, `.hxx`, `.hh` | C Family | Classes, templates, RAII patterns via tree-sitter-cpp |
| Assembly (Generic) | `.s`, `.asm`, `.S` | Low-level | Instruction blocks, labels, directives via tree-sitter-asm |
| Shell / Bash | `.sh`, `.bash`, `.zsh`, `.fish` | Shell | Functions, variable assignments, command invocations via tree-sitter-bash |
| Gradle (Groovy) | `.gradle` | JVM Build | Groovy-based build files via tree-sitter-groovy |
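The Go and Python rows above rely on naming conventions rather than explicit modifiers. As a rough illustration of what such a check can look like, here is a minimal sketch. The `Visibility` type and both function names are hypothetical and are not taken from the engine's source; the engine itself works on tree-sitter ASTs, whereas this sketch only looks at the symbol name.

```rust
/// Visibility inferred purely from a symbol's name.
/// Hypothetical type for illustration; not the engine's schema.
#[derive(Debug, PartialEq, Eq)]
pub enum Visibility {
    Public,
    Private,
}

/// Go convention: an identifier is exported when its first
/// character is an uppercase letter.
pub fn go_visibility(name: &str) -> Visibility {
    match name.chars().next() {
        Some(c) if c.is_uppercase() => Visibility::Public,
        _ => Visibility::Private,
    }
}

/// Python convention: a leading underscore marks a name as private
/// by agreement. This sketch also treats dunder names such as
/// `__init__` as private, which a real extractor may want to refine.
pub fn python_visibility(name: &str) -> Visibility {
    if name.starts_with('_') {
        Visibility::Private
    } else {
        Visibility::Public
    }
}
```

For example, `go_visibility("ServeHTTP")` yields `Visibility::Public`, while `python_visibility("_helper")` yields `Visibility::Private`.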
These are critical for large C, C++, embedded, and retro-console codebases.
| Language / Format | Extensions | Purpose | Implementation Details |
|---|---|---|---|
| Makefile | `Makefile`, `.mk` | Build system | Target graph, recipes, variables via tree-sitter-make |
| CMake | `CMakeLists.txt`, `.cmake` | Build system | Target definitions, dependencies via tree-sitter-cmake |
| GNU Linker Scripts | `.ld` | Toolchain | Structural parsing only (no semantic pass yet) |
| GCC Attributes & Pragmas | inline in C/C++ | Compiler control | Parsed as part of C/C++ AST |
| Framework | Detection Method | Extracted Information | Status |
|---|---|---|---|
| React | Import from `react` | Components, hooks, forwardRef, memo | ✅ Done |
| Next.js | `/app/`, `/pages/` patterns | API routes, layouts, server/client components | ✅ Done |
| Express | Import from `express` | Route handlers, middleware | ✅ Done |
| Angular | Decorators (`@Component`) | Components, services, modules | ✅ Done |
| Vue | `.vue` files | SFC script extraction, Composition API | ✅ Done |
| NestJS | Decorators + bootstrap | Controllers, modules, providers | ✅ Done |
| Koa | Router + `app.use` | Route handlers, middleware | ☐ Planned |
| Fastify | `fastify.METHOD`, hooks | Route handlers, lifecycle hooks | ☐ Planned |
| Hapi | `server.route` + lifecycle hooks | Route handlers, request lifecycle | ☐ Planned |
| Sails / Adonis | Controller/action patterns | Route actions, policies | ☐ Planned |
| Remix | Route module exports | `loader`, `action`, `default` | ☐ Planned |
| Astro | Route files + endpoints | SSR routes, API handlers | ☐ Planned |
| SvelteKit | `+page`/`+layout`/`+server` files | `load`, `actions`, endpoints | ☐ Planned |
| Nuxt | `pages/`, `server/api/`, plugins | Routes, middleware, modules | ☐ Planned |
| Serverless (Vercel/Netlify/AWS) | Handler exports | Serverless entry handlers | ☐ Planned |
| Cloudflare Workers | `fetch`/`scheduled` handlers | Worker entry points | ☐ Planned |
| Socket.io / ws | Connection + event handlers | Realtime entry handlers | ☐ Planned |
| GraphQL (Apollo/Yoga/Helix) | Resolver map exports | Resolvers, schema bindings | ☐ Planned |
| Tooling (Vite/Webpack/Rollup/Babel) | Config + plugin hooks | Build entry + plugin hooks | ☐ Planned |
| CLI (Commander/Yargs/Oclif) | Command registration | CLI command handlers | ☐ Planned |
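The "Detection Method" column above combines two kinds of signal: what a file imports and where it lives in the repository. The sketch below illustrates that idea with plain string matching. It is a simplified stand-in, not the engine's implementation (which is AST-based), and the `Framework` enum, `detect_frameworks`, and `imports_module` are hypothetical names.

```rust
/// Frameworks this sketch can recognize (a hypothetical subset).
#[derive(Debug, PartialEq, Eq)]
pub enum Framework {
    React,
    NextJs,
    Express,
}

/// Collect framework hints from a file's path and its source text.
pub fn detect_frameworks(path: &str, source: &str) -> Vec<Framework> {
    let mut found = Vec::new();
    // Import-based signal: the file pulls in the `react` package.
    if imports_module(source, "react") {
        found.push(Framework::React);
    }
    // Path-based signal: Next.js routes live under /app/ or /pages/.
    if path.contains("/app/") || path.contains("/pages/") {
        found.push(Framework::NextJs);
    }
    // Import-based signal: the file pulls in the `express` package.
    if imports_module(source, "express") {
        found.push(Framework::Express);
    }
    found
}

/// True when some line imports or requires exactly `module`.
/// Text matching only; it can be fooled by comments or string literals.
fn imports_module(source: &str, module: &str) -> bool {
    let double = format!("\"{}\"", module);
    let single = format!("'{}'", module);
    source.lines().any(|line| {
        let line = line.trim_start();
        let mentions = line.contains(&double) || line.contains(&single);
        mentions && (line.starts_with("import ") || line.contains("require("))
    })
}
```

A path check alone is a weak signal (any project can have an `app/` directory), so a real detector would likely combine it with other evidence before reporting a framework.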
| Framework | Detection Method | Extracted Information | Status |
|---|---|---|---|
| ASP.NET Core MVC | Attributes + controller base | Controller actions | ☐ Planned |
| ASP.NET Minimal APIs | `MapGet`/`MapPost` handlers | Route handlers | ☐ Planned |
| Razor Pages | `PageModel` handlers | Page lifecycle methods | ☐ Planned |
| Blazor | `@page` directives | Routed components | ☐ Planned |
| gRPC | Service base classes | RPC handlers | ☐ Planned |
| Azure Functions | `[FunctionName]` attributes | Function handlers | ☐ Planned |
| Unity | MonoBehaviour lifecycle | `Start`, `Update`, `Awake` | ☐ Planned |
| Godot (C#) | Node lifecycle methods | `_Ready`, `_Process`, `_PhysicsProcess` | ☐ Planned |
| MAUI / Xamarin | App lifecycle | App entry + page routes | ☐ Planned |
| Framework | Detection Method | Extracted Information | Status |
|---|---|---|---|
| Django | URL + view patterns | Views, URL routes | ☐ Planned |
| Flask | `@app.route` decorators | Route handlers | ☐ Planned |
| FastAPI | `@app.get`/`post` decorators | Route handlers, DI | ☐ Planned |
| Celery / RQ | Task decorators | Task entry points | ☐ Planned |
| Click / Typer | CLI decorators | Command handlers | ☐ Planned |
| Airflow | DAG declarations | Workflow entry points | ☐ Planned |
| Framework | Detection Method | Extracted Information | Status |
|---|---|---|---|
| net/http | Handler registration | Route handlers | ☐ Planned |
| Gin/Echo/Fiber/Chi | Router registration | Route handlers | ☐ Planned |
| gRPC | Service impls | RPC handlers | ☐ Planned |
| Cobra | Command registration | CLI command handlers | ☐ Planned |
| Framework | Detection Method | Extracted Information | Status |
|---|---|---|---|
| Spring Boot / MVC | Annotations | Controllers, routes | ☐ Planned |
| Micronaut / Quarkus | Annotations + DI | Controllers, beans | ☐ Planned |
| JAX-RS / Jakarta EE | Annotations | Resource handlers | ☐ Planned |
| Android (Java/Kotlin) | App lifecycle + manifest | Activities, services | ☐ Planned |
| Ktor | Routing blocks | Route handlers | ☐ Planned |
| Jetpack Compose | `@Composable` | UI entry points | ☐ Planned |
| Framework | Detection Method | Extracted Information | Status |
|---|---|---|---|
| Actix/Axum/Rocket/Warp | Route macros | Route handlers | ☐ Planned |
| Tonic | Service trait impls | RPC handlers | ☐ Planned |
| Bevy | System registration | Game systems + app entry | ☐ Planned |
| Framework / Domain | Detection Method | Extracted Information | Status |
|---|---|---|---|
| Unreal Engine | Reflection macros | Gameplay classes, module entry | ☐ Planned |
| SDL/GLFW/Qt | App init + event loop | Application entry | ☐ Planned |
| Embedded / RTOS | ISR naming + startup code | Interrupt handlers | ☐ Planned |
| Framework / Domain | Detection Method | Extracted Information | Status |
|---|---|---|---|
| SwiftUI | `@main` app + `Scene` | App entry + scene graph | ☐ Planned |
| Vapor | Route registration | Route handlers | ☐ Planned |
| Laravel/Symfony | Controller/routes | Web entry points | ☐ Planned |
| WordPress | Hook/action patterns | Plugin entry points | ☐ Planned |
| Odin | `package main`, `proc main` | Language entry points | ☐ Planned |
| Dreamcast/KOS | `main`, init routines | Boot sequence + subsystem entry points | ☐ Planned |
| Language | Extensions | Implementation Details |
|---|---|---|
| HTML | `.html`, `.htm` | DOM structure via tree-sitter-html |
| CSS | `.css` | Stylesheet structure via tree-sitter-css |
| SCSS / SASS | `.scss`, `.sass` | Nested rules via tree-sitter-scss |
| Markdown | `.md`, `.markdown` | Section and block structure via tree-sitter-md |
| Language | Extensions | Implementation Details |
|---|---|---|
| JSON | `.json` | Structural parsing via tree-sitter-json |
| YAML | `.yaml`, `.yml` | Structural parsing via tree-sitter-yaml |
| TOML | `.toml` | Config parsing via tree-sitter-toml-ng |
| XML | `.xml`, `.svg`, `.plist`, `.pom` | Tree structure via tree-sitter-xml |
| HCL / Terraform | `.tf`, `.hcl`, `.tfvars` | IaC parsing via tree-sitter-hcl |
| Format | Extension | Implementation Details |
|---|---|---|
| Vue SFC | `.vue` | Script extraction with language-aware parsing |
Semfora Engine includes semantic duplicate detection that identifies structurally similar code while filtering expected boilerplate.
| Language | Patterns | Status |
|---|---|---|
| JavaScript / TypeScript | 19 | Full support |
| Rust | 13 | Full support |
| C# | 18 | Full support |
| Python | 0 | Planned |
| Go | 0 | Planned |
| Java | 0 | Planned |
| C / C++ | 0 | Planned |
| Assembly | N/A | Structural only |
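To make "structurally similar" concrete, one common approach is to compare code after erasing the names that differ between copies. The sketch below is only an illustration of that general technique under simplifying assumptions: it is not the engine's algorithm, it tokenizes text instead of walking an AST, it does not implement boilerplate filtering, and `fingerprint`, `structurally_equal`, and the keyword list are hypothetical.

```rust
/// Keywords kept verbatim; every other word is treated as a
/// renameable identifier. Deliberately tiny list for the sketch.
const KEYWORDS: [&str; 8] = ["fn", "let", "if", "else", "for", "while", "return", "match"];

/// Reduce source text to a structural fingerprint: keywords and
/// punctuation survive, identifiers become "ID", integer literals
/// become "NUM", and whitespace is dropped.
pub fn fingerprint(source: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut word = String::new();
    for ch in source.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            word.push(ch);
            continue;
        }
        flush_word(&mut word, &mut tokens);
        if !ch.is_whitespace() {
            tokens.push(ch.to_string());
        }
    }
    flush_word(&mut word, &mut tokens);
    tokens
}

/// Classify the pending word (if any), push it, and reset the buffer.
fn flush_word(word: &mut String, tokens: &mut Vec<String>) {
    if word.is_empty() {
        return;
    }
    let token = if KEYWORDS.contains(&word.as_str()) {
        word.clone()
    } else if word.chars().all(|c| c.is_ascii_digit()) {
        "NUM".to_string()
    } else {
        "ID".to_string()
    };
    tokens.push(token);
    word.clear();
}

/// Two snippets count as duplicates here when their fingerprints match,
/// i.e. they differ only in identifier names and literal values.
pub fn structurally_equal(a: &str, b: &str) -> bool {
    fingerprint(a) == fingerprint(b)
}
```

Under this scheme `fn add(a, b) { return a + b; }` and `fn sum(x, y) { return x + y; }` produce the same fingerprint. A per-language boilerplate pattern could then be expressed as a fingerprint (or family of fingerprints) to exclude from the results.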
| Language | Planned Patterns | Priority |
|---|---|---|
| Python | pytest fixtures, dataclasses, FastAPI routes, Pydantic models | High |
| Go | HTTP handlers, middleware, error wrapping | High |
| Java | Spring controllers, Lombok, DTOs, JPA entities | High |
| C / C++ | RAII wrappers, copy/move boilerplate, driver init blocks | High |
| Kotlin | Data classes, coroutines, Ktor routing | High |
| Makefile | Repeated build targets, recursive includes | Medium |
Prioritized by enterprise relevance, embedded systems reach, and large-repo payoff.
- C#
- HCL / Terraform
- JavaScript / TypeScript
- Rust
- Core C / C++
Critical for:
- Embedded systems
- Emulators
- Operating systems
- Retro-console SDKs (KallistiOS, SDL ports)
| Focus | Details |
|---|---|
| Assembly Integration | SH-4, ARM, x86 inline asm correlation |
| Driver Patterns | IRQ handlers, register maps, init/shutdown |
| Build Graphs | Makefile + CMake cross-analysis |
| Dreamcast (KOS) | Boot sequence, main, subsystem init |
| Item | Details |
|---|---|
| Parser | tree-sitter-kotlin-ng |
| Targets | Coroutines, sealed classes, Android + server |
| Item | Details |
|---|---|
| Parser | tree-sitter-swift |
| Targets | Protocols, SwiftUI, async/await |
| Item | Details |
|---|---|
| Parser | tree-sitter-php |
| Targets | Laravel, WordPress |
| Language | Extensions | Mode |
|---|---|---|
| Dockerfile | `Dockerfile` | Structural |
| PowerShell | `.ps1` | Structural |
| Linker scripts | `.ld` | Structural |
| Item | Details |
|---|---|
| Parser | tree-sitter-odin (or custom) |
| Targets | package main + proc main(), package init, game libs |
| Format | Extensions | Reason |
|---|---|---|
| Jest Snapshots | `.shot` | Test artifacts |
| MDX | `.mdx` | Hybrid JSX + Markdown |
| AsciiDoc | `.adoc` | Docs-only |
| Protocol Buffers | `.proto` | Tree-sitter version mismatch |
| Scala | `.scala` | Low demand vs complexity |
| Elixir | `.ex`, `.exs` | Low enterprise priority |
src/
├── main.rs # CLI entry point (semfora-engine binary)
├── cli.rs # CLI argument definitions
├── lib.rs # Library exports
├── lang.rs # Language detection from file extensions
├── extract.rs # Main extraction orchestration
├── schema.rs # SemanticSummary output schema
├── toon.rs # TOON format encoding
├── risk.rs # Behavioral risk calculation
├── error.rs # Error types and exit codes
├── cache.rs # Cache management and querying
├── shard.rs # Sharded index generation
├── detectors/ # Language-specific extractors
│ ├── javascript/ # JS/TS with framework support
│ │ ├── core.rs # Core JS/TS extraction
│ │ └── frameworks/ # React, Next.js, Express, Angular, Vue
│ ├── rust.rs
│ ├── python.rs
│ ├── go.rs
│ ├── java.rs
│ ├── kotlin.rs
│ ├── shell.rs
│ ├── gradle.rs
│ ├── c_family.rs
│ ├── markup.rs
│ ├── config.rs
│ ├── grammar.rs # AST node mappings per language
│ └── generic.rs # Generic extraction using grammars
├── mcp_server/ # MCP server (semfora-engine serve subcommand)
│ ├── mod.rs # MCP tool handlers
│ └── bin.rs # Server entry point
└── socket_server/ # WebSocket daemon (semfora-daemon binary)
  ├── mod.rs # Server architecture
  ├── bin.rs # Daemon entry point
  ├── connection.rs # Client connection handling
  ├── protocol.rs # Message types
  └── repo_registry.rs # Multi-repo context management
- Add tree-sitter grammar to `Cargo.toml`
- Add `Lang` variant in `lang.rs` with extension mapping
- Add `LangGrammar` in `detectors/grammar.rs` with AST node mappings
- (Optional) Create dedicated detector in `detectors/` for special features
- Wire up in `extract.rs` dispatcher
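The second step above (a `Lang` variant with an extension mapping) might look roughly like the following. This is a minimal sketch that assumes `Lang` is a plain enum. The variant set is reduced to a few entries, Swift stands in for the newly added language, and the `from_extension` and `from_path` helpers are hypothetical names; the actual definitions in `lang.rs` may differ.

```rust
use std::path::Path;

/// Reduced, illustrative version of a language enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Lang {
    Rust,
    Python,
    Go,
    Swift, // the variant being added in this example
}

impl Lang {
    /// Map a file extension (without the leading dot) to a language.
    pub fn from_extension(ext: &str) -> Option<Lang> {
        match ext {
            "rs" => Some(Lang::Rust),
            "py" | "pyi" => Some(Lang::Python),
            "go" => Some(Lang::Go),
            "swift" => Some(Lang::Swift), // new extension mapping
            _ => None,
        }
    }

    /// Detect the language of a path from its extension, if it has one.
    pub fn from_path(path: &Path) -> Option<Lang> {
        path.extension()
            .and_then(|ext| ext.to_str())
            .and_then(Lang::from_extension)
    }
}
```

With this in place, `Lang::from_path(Path::new("App.swift"))` returns `Some(Lang::Swift)`, and files with no or unknown extensions return `None`, which a dispatcher could treat as "skip this file".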
| Document | Description |
|---|---|
| Quick Start | Get up and running in 5 minutes |
| CLI Reference | Complete CLI usage, subcommands, and examples |
| Features | Incremental indexing, layered indexes, risk assessment |
| MCP Tools Reference | All MCP tools for AI agent integration |
| MCP Workflows | Common MCP usage patterns |
| WebSocket Daemon | Real-time updates, protocol, and query methods |
| Adding Languages | Guide for adding new language support |
| Architecture | Implementation details and design |