Refio – open-source, local-first coding companion for IntelliJ IDEA
- Project Overview
- Key Features
- How Refio Works
- Execution Modes
- Context System
- RAG System
- Tools
- Subagents
- MCP Protocol
- LLM Providers
- Tech Stack
- Getting Started
- Configuration
- Security
- Project Status
- License
Refio is a local-first AI assistant packaged as an IntelliJ plugin written entirely in Kotlin. It supports three execution modes: Chat (conversation with project context), Plan (read-only analysis with tool usage), and Agent (full read/write access with automatic execution).
Unlike tools that send entire codebases to LLMs, Refio uses selective context injection through RAG (Retrieval-Augmented Generation) and comprehensive code analysis. This approach:
- Reduces API costs by 50-70%
- Provides faster responses
- Works with smaller context windows (local models)
- Maintains privacy (no-egress mode available)
This project demonstrates that AI coding agents can successfully build other coding agents. Refio serves as both a practical coding assistant and a proof-of-concept for AI-assisted development tool creation.
| Feature | Description |
|---|---|
| Three Execution Modes | Chat, Plan, and Agent with mode-appropriate tools |
| Enhanced AgentTurnLoop | AgentTurnLoop execution + auto-compaction, prompt caching, parallel tools (ADR-0028) |
| 18 Context Providers | @file, @folder, @codebase, @grep, @url, @docs, @commit, and more |
| RAG Indexing | Automatic project indexing with language-specific analyzers |
| 10 Tools | 5 read-only + 5 write tools with security layers |
| 6 LLM Adapters | Ollama, OpenAI, Anthropic, Gemini, OpenRouter, LM Studio |
| MCP Protocol | Full Model Context Protocol with 16 built-in presets |
| Subagents System | Specialized agents with custom prompts and tool permissions |
| Performance Optimizations | Token estimation, retry logic, working memory integration |
| No-Egress Mode | Block cloud providers, use only local models |
| Native UI | IntelliJ Swing components, no webview |
┌─────────────────────────────────────────────────────────────────────────┐
│ UI Layer (IntelliJ Swing) │
│ RefioToolWindowFactory → ChatView → PromptInputPanel │
├─────────────────────────────────────────────────────────────────────────┤
│ Service Layer (Project-Scoped) │
│ SessionManager (facade) │
│ ├── SessionStateManager (11 StateFlows for reactive UI) │
│ ├── SessionLifecycleService (session create/switch/load) │
│ ├── MessageDispatcher (CHAT vs PLAN/AGENT routing) │
│ └── SubtaskTracker (subtask lifecycle management) │
├─────────────────────────────────────────────────────────────────────────┤
│ Execution Layer │
│ ├── CHAT mode → WorkflowOrchestrator → ChatExecutor │
│ └── PLAN/AGENT mode → AgentTurnLoop (self-directing tool loop) │
├─────────────────────────────────────────────────────────────────────────┤
│ Core Layer (In-Process API) │
│ CoreApiRouter (facade + 9 domain routers) │
│ ├── ChatRouter, TaskRouter, SubtaskRouter │
│ ├── RagRouter, ToolRouter, PromptsRouter │
│ └── ContextService (dynamic context building) │
├─────────────────────────────────────────────────────────────────────────┤
│ Infrastructure Layer │
│ ├── LLMClient (unified) → 6 provider adapters │
│ ├── ToolRegistry → 10 tools with security layers │
│ ├── MCPManager → MCP server lifecycle (STDIO/HTTP) │
│ ├── EmbeddingsService → Ollama/OpenAI embeddings │
│ └── DatabaseFactory → SQLite (WAL) + Exposed ORM │
└─────────────────────────────────────────────────────────────────────────┘
User Input
↓
SessionManager.sendMessage()
↓
┌─────────────────────────────────────────────────────────────────────────┐
│ CHAT mode │ PLAN/AGENT mode │
│ │ │
│ WorkflowOrchestrator │ AgentTurnLoop.runTurn() │
│ ↓ │ ↓ │
│ IntentRouter │ 1. Save user message │
│ ↓ │ 2. Build prompt + tool descriptions │
│ ChatExecutor │ 3. Call LLM │
│ ↓ │ 4. If tool calls: │
│ ChatService.chat() │ - Create subtasks │
│ ↓ │ - Execute tools │
│ LLMClient.complete() │ - Summarize results │
│ ↓ │ - Continue loop │
│ Response │ 5. Save final response │
└─────────────────────────────────────────────────────────────────────────┘
↓
MessageDispatcher.loadMessages()
↓
UI Update via StateFlow
Chat mode: a direct LLM conversation with full project context; no tool execution.
- Uses ChatExecutor → ChatService
- Streaming responses
- Context includes: project analysis, conversation history, RAG fragments, @mentions
Plan mode: read-only analysis with tool usage for codebase exploration.
- Uses AgentTurnLoop with READ_ONLY tools only
- Model self-directs tool usage (read_file, grep_search, file_search, etc.)
- Each tool call creates a tracked subtask
- Ideal for: code review, architecture analysis, bug investigation
Agent mode: full read/write access with automatic execution.
- Uses AgentTurnLoop with ALL tools
- Model can: create files, edit code, run multi-file edits
- Subtask lifecycle: PENDING → RUNNING → SUCCESS/FAILED
- File snapshots before write operations (rollback support)
- Safety: loop detection, error rate monitoring, max iterations
AgentTurnLoop includes production-grade enhancements:
| Optimization | Benefit |
|---|---|
| Auto-Compaction | Prevents context overflow by summarizing old messages at 80-85% capacity |
| Prompt Caching | Caches static prompts for 5min, reducing construction overhead |
| Parallel Tools | Executes READ_ONLY tools concurrently (~2-3x faster for multi-tool calls) |
| Retry Logic | Exponential backoff for rate limits/timeouts (1s → 2s → 4s) |
| Token Estimation | Pre-flight counting prevents unexpected API errors |
| Working Memory | Auto-extracts knowledge from tool results for context continuity |
Configuration: Mode-specific settings (PLAN: 15 iterations, AGENT: 25 iterations)
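The retry schedule above (1s → 2s → 4s) can be sketched as follows. This is an illustrative implementation of exponential backoff, not Refio's actual code; the class and function names are hypothetical.

```kotlin
// Sketch of exponential backoff for rate limits/timeouts (1s -> 2s -> 4s).
// RetryPolicy and its members are illustrative names, not Refio's API.
class RetryPolicy(
    private val baseDelayMs: Long = 1_000,
    private val maxAttempts: Int = 3,
) {
    // Delay before retrying after attempt `attempt` (0-based): base * 2^attempt.
    fun backoffDelayMs(attempt: Int): Long = baseDelayMs shl attempt

    // Runs `block`, retrying on exception up to maxAttempts times.
    fun <T> execute(sleep: (Long) -> Unit = { Thread.sleep(it) }, block: () -> T): T {
        var lastError: Exception? = null
        repeat(maxAttempts) { attempt ->
            try {
                return block()
            } catch (e: Exception) {
                lastError = e
                if (attempt < maxAttempts - 1) sleep(backoffDelayMs(attempt))
            }
        }
        throw lastError!!
    }
}
```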
| Provider | Type | Description |
|---|---|---|
| @file | SUBMENU | File picker with search |
| @folder | SUBMENU | Directory structure browser |
| @current | NORMAL | Currently active editor file |
| @recent | SUBMENU | Recently edited files (15-file history) |
| @open_files | NORMAL | All open editor tabs |
| @clipboard | NORMAL | System clipboard content |
| @terminal | NORMAL | Recent terminal output (max 200 lines) |
| @problems | NORMAL | Compilation errors/warnings |
| @diff | NORMAL | Git uncommitted changes |
| @codebase | QUERY | Semantic search via RAG embeddings |
| @grep | QUERY | Regex search across project |
| @url | QUERY | Fetch web content (100KB max) |
| @commit | QUERY | Git commit details by hash |
| @docs | SUBMENU | Documentation search with semantic ranking |
- NORMAL: No user input required (clipboard, current, terminal, problems, diff)
- QUERY: Requires search term (@codebase:auth, @grep:TODO, @url:https://...)
- SUBMENU: Interactive selection (@file, @folder, @recent, @docs)
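A hypothetical parser for the three mention shapes above. The provider names come from the table; the parsing code itself is an illustration, not Refio's implementation.

```kotlin
// Illustrative @mention parsing; provider-to-kind mapping follows the table
// above, but the code is a sketch, not Refio's ContextProviderRegistry.
enum class MentionKind { NORMAL, QUERY, SUBMENU }

data class Mention(val provider: String, val query: String?)

private val QUERY_PROVIDERS = setOf("codebase", "grep", "url", "commit")
private val SUBMENU_PROVIDERS = setOf("file", "folder", "recent", "docs")

fun kindOf(provider: String): MentionKind = when (provider) {
    in QUERY_PROVIDERS -> MentionKind.QUERY
    in SUBMENU_PROVIDERS -> MentionKind.SUBMENU
    else -> MentionKind.NORMAL
}

// Matches @provider or @provider:query (query runs to the next whitespace).
private val MENTION = Regex("""@([a-z_]+)(?::(\S+))?""")

fun parseMentions(input: String): List<Mention> =
    MENTION.findAll(input).map { m ->
        Mention(m.groupValues[1], m.groupValues[2].ifEmpty { null })
    }.toList()
```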
User Input + @mentions
↓
ContextService.buildProjectContext()
├── 1. Load cached project analysis (architecture, dependencies)
├── 2. Extract conversation history (with summarization)
├── 3. Build subtask summaries (completed steps)
├── 4. Load RAG fragments (code + docs)
├── 5. Load MCP resources
├── 6. Resolve @mentions via ContextProviderRegistry
└── 7. Combine into ProjectContextDTO
↓
Token budgeting (section limits, ~28K tokens default)
↓
Final prompt sent to LLM
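The token-budgeting step can be sketched as below. The chars/4 heuristic is a common rough estimate for English text and code; Refio's actual estimator and section limits may differ.

```kotlin
// Hypothetical per-section token budgeting (~28K tokens total by default).
// chars/4 is a rough approximation, not Refio's exact token counter.
fun estimateTokens(text: String): Int = (text.length + 3) / 4

// Truncates each named section to its token cap so the combined prompt
// stays within the overall budget.
fun budgetSections(
    sections: List<Pair<String, String>>,  // (section name, content)
    limits: Map<String, Int>,              // per-section token caps
): String = sections.joinToString("\n\n") { (name, content) ->
    val cap = limits[name] ?: Int.MAX_VALUE
    if (estimateTokens(content) <= cap) content
    else content.take(cap * 4)             // cut back to roughly `cap` tokens
}
```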
Project Files
↓
RagIndexingService.indexProject() [background, at startup]
├── Scan files (40+ extensions: .kt, .java, .py, .ts, .js, etc.)
├── Compute SHA-256 checksum (incremental detection)
└── Classify: NEW / MODIFIED / UNCHANGED
↓
FileAnalyzerService.analyze()
├── Language detection
├── AST-like parsing (regex-based)
└── Extract: classes, functions, imports, annotations
↓
ChunkingStrategy.createChunks()
├── SemanticChunkingStrategy (structure-aware)
│ ├── Full-file chunks
│ ├── Class-level chunks
│ └── Function-level chunks
└── DefaultChunkingStrategy (line-based fallback)
↓
EmbeddingsService.generateBatch()
├── Ollama: nomic-embed-text (768 dims)
└── OpenAI: text-embedding-3-small (1536 dims)
↓
SQLite Storage
├── IndexFilesTable (metadata + checksum)
├── IndexChunksTable (content + positions)
└── EmbeddingsTable (vector BLOB, little-endian float32)
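The final storage step stores each vector as little-endian float32 values in a BLOB, as noted above. A minimal sketch of that byte layout (function names are illustrative):

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Sketch of the EmbeddingsTable vector format: little-endian float32 BLOB.
fun encodeEmbedding(vector: FloatArray): ByteArray {
    val buf = ByteBuffer.allocate(vector.size * 4).order(ByteOrder.LITTLE_ENDIAN)
    vector.forEach { buf.putFloat(it) }
    return buf.array()
}

fun decodeEmbedding(blob: ByteArray): FloatArray {
    val buf = ByteBuffer.wrap(blob).order(ByteOrder.LITTLE_ENDIAN)
    return FloatArray(blob.size / 4) { buf.getFloat() }
}
```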
Query: "authentication logic"
↓
EmbeddingProvider.generateEmbedding(query)
↓
RagRepository.getEmbeddings(projectRoot)
↓
Cosine Similarity Calculation
cos(q, e) = (q · e) / (||q|| × ||e||)
↓
Filter: similarity >= threshold (default 0.5)
↓
Sort by similarity, take topK (default 5)
↓
Optional: Hybrid search (70% semantic + 30% keyword)
↓
RagSearchResult[]
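The scoring pipeline above (cosine similarity, threshold filter, top-K) amounts to the following sketch. `RagHit` is an illustrative stand-in for RagSearchResult; the hybrid 70/30 weighting step is omitted.

```kotlin
import kotlin.math.sqrt

// Illustrative retrieval math: cosine similarity, threshold 0.5, topK 5.
data class RagHit(val chunkId: Long, val score: Double)

fun cosine(q: FloatArray, e: FloatArray): Double {
    var dot = 0.0; var nq = 0.0; var ne = 0.0
    for (i in q.indices) {
        dot += q[i] * e[i]
        nq += q[i] * q[i]
        ne += e[i] * e[i]
    }
    return if (nq == 0.0 || ne == 0.0) 0.0 else dot / (sqrt(nq) * sqrt(ne))
}

fun search(
    query: FloatArray,
    embeddings: Map<Long, FloatArray>,  // chunkId -> stored vector
    threshold: Double = 0.5,
    topK: Int = 5,
): List<RagHit> = embeddings
    .map { (id, vec) -> RagHit(id, cosine(query, vec)) }
    .filter { it.score >= threshold }
    .sortedByDescending { it.score }
    .take(topK)
```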
| Language | Analyzer | Extracts |
|---|---|---|
| Kotlin | KotlinLanguageAnalyzer | Classes, objects, functions, data classes, coroutines |
| Java | JavaLanguageAnalyzer | Classes, interfaces, methods, annotations (Spring) |
| Python | PythonLanguageAnalyzer | Classes, functions, decorators, type hints |
| TypeScript | TypeScriptLanguageAnalyzer | Classes, interfaces, functions, React components |
| HTML | HtmlLanguageAnalyzer | Structure, scripts, styles |
| Tool | Parameters | Description |
|---|---|---|
| read_file | path | Read file content (2MB max) |
| read_directory | path, recursive, max_depth | List directory tree |
| file_search | pattern, path, offset, limit | Glob pattern search |
| grep_search | pattern, path, case_sensitive | Regex content search |
| view_diff | file1, file2 OR content2 | Line-by-line comparison |
| Tool | Parameters | Description | Cost |
|---|---|---|---|
| create_new_file | path, content | Create file with parent dirs | Free |
| code_editing | path, old_string, new_string, replace_all | Search-and-replace | Free |
| multi_edit | edits[] | Atomic multi-file edit | Free |
| multi_line_editor | path, edit_description | LLM identifies line ranges | ~$0.02 |
| advance_code_editing | path, edit_description | Full file regeneration | ~$0.06 |
| run_terminal_command | command | Shell execution (DISABLED) | Free |
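The code_editing semantics can be sketched as a literal (non-regex) search-and-replace. This is an assumption based on the old_string/new_string/replace_all parameter names; the real tool additionally runs inside the path sandbox and snapshots files before writing.

```kotlin
// Illustrative search-and-replace edit; assumes literal matching and a
// uniqueness requirement when replace_all is false. Not Refio's actual tool.
fun applyEdit(
    content: String,
    oldString: String,
    newString: String,
    replaceAll: Boolean = false,
): String {
    require(content.contains(oldString)) { "old_string not found in file" }
    if (!replaceAll) {
        // Requiring a unique match avoids ambiguous edits.
        require(content.indexOf(oldString) == content.lastIndexOf(oldString)) {
            "old_string matches more than once; pass replace_all=true"
        }
    }
    return if (replaceAll) content.replace(oldString, newString)
    else content.replaceFirst(oldString, newString)
}
```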
| Mode | READ_ONLY | WRITE |
|---|---|---|
| CHAT | ❌ | ❌ |
| PLAN | ✅ | ❌ |
| AGENT | ✅ | ✅ |
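The permission matrix above reduces to a simple per-mode filter. Type names here are illustrative, not Refio's ToolRegistry API.

```kotlin
// Sketch of per-mode tool filtering matching the matrix above.
enum class Mode { CHAT, PLAN, AGENT }
enum class ToolAccess { READ_ONLY, WRITE }

data class Tool(val name: String, val access: ToolAccess)

fun allowedTools(mode: Mode, registry: List<Tool>): List<Tool> = when (mode) {
    Mode.CHAT -> emptyList()                                         // no tools at all
    Mode.PLAN -> registry.filter { it.access == ToolAccess.READ_ONLY }
    Mode.AGENT -> registry                                           // full access
}
```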
Specialized AI assistants invoked with !agent-name prefix.
| Agent | Model | Purpose |
|---|---|---|
| security-reviewer | weak | Security audits, OWASP vulnerabilities |
| code-reviewer | default | Code quality, patterns, bugs |
Create custom agents in:
- User-level: ~/.refio/agents/my-agent.md
- Project-level: <project>/.refio/agents/my-agent.md
---
name: my-agent
description: Custom agent for specific task
tools: read_file, grep_search
model: default
priority: 5
enabled: true
---
You are a specialized assistant for...
[system prompt content]

The model field accepts the following aliases:

| Alias | Maps To |
|---|---|
| inherit | Parent conversation model |
| default | ConfigService DEFAULT model |
| plan | ConfigService PLAN model |
| coding | ConfigService CODING model |
| weak | ConfigService WEAK model |
| sonnet | claude-3-5-sonnet-20241022 |
| opus | claude-3-opus-20240229 |
| haiku | claude-3-haiku-20240307 |
Full Model Context Protocol implementation with STDIO and HTTP/SSE transports.
| Category | Servers |
|---|---|
| VCS | GitHub, GitLab |
| Databases | PostgreSQL, SQLite |
| Search | Brave Search, Exa |
| Docs | Context7 |
| DevOps | Sentry, AWS |
| Storage | Google Drive, Filesystem |
| Development | Puppeteer, Sequential Thinking |
| Collaboration | Slack |
| Other | Memory, Custom API |
# .refio/config.yaml
mcp:
  servers:
    - id: "github"
      type: "STDIO"
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-github"]
      accessMode: "READ"
      enabled: true
      env:
        - name: "GITHUB_TOKEN"
          value: "${GITHUB_TOKEN}"
          isSecret: true

| Provider | Models | Features |
|---|---|---|
| Ollama | Local models | Free, JSON mode, local privacy |
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3, GPT-5 | Responses API, reasoning models |
| Anthropic | Claude 3.5/3.7, Opus 4.1 | Thinking mode, top-level system |
| Gemini | 2.5 Flash/Pro | system_instruction, thinkingConfig |
| OpenRouter | All providers | Unified gateway, dynamic pricing |
| LM Studio | Local models | OpenAI-compatible, free |
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4o | $2.50 | $10.00 |
| claude-3-5-sonnet | $3.00 | $15.00 |
| claude-opus-4-1 | $15.00 | $75.00 |
| Ollama (local) | $0.00 | $0.00 |
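Given the per-1M-token prices above (the convention these providers use), estimating the cost of a request is straightforward. The function name is illustrative:

```kotlin
// Sketch of cost estimation from the pricing table, assuming prices are
// USD per 1M tokens.
fun estimateCostUsd(
    inputTokens: Long,
    outputTokens: Long,
    inputPricePerM: Double,
    outputPricePerM: Double,
): Double = inputTokens / 1_000_000.0 * inputPricePerM +
    outputTokens / 1_000_000.0 * outputPricePerM
```

For example, a gpt-4o call with 10K input and 2K output tokens would cost roughly 10,000/1M × $2.50 + 2,000/1M × $10.00 = $0.045.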
- Language/Platform: Kotlin 1.9.25, JVM target 17, IntelliJ Platform 2024.1.7 (IC)
- UI: Native IntelliJ Swing components (no webview)
- Core/Transport: In-process CoreApiRouter (optional Ktor server for CLI, planned v1.1+)
- LLM/HTTP: Ktor Client 2.3.7 with 6 provider adapters
- Database: SQLite (WAL) via Exposed 0.46.0 + sqlite-jdbc 3.44.1.0
- Serialization: Gson 2.10.1, kotlinx-serialization-json 1.6.2, KAML 0.55.0
- Markdown: commonmark 0.21.0
- Logging: kotlin-logging + logback 1.4.14
- Build: Gradle IntelliJ Plugin 1.17.4
- JDK 17 and IntelliJ IDEA 2024.x
- Ollama from https://ollama.com/
- Required models:
ollama pull nomic-embed-text:latest   # Embeddings
ollama pull qwen2.5-coder:14b         # Coding
# Clone repository
git clone https://github.com/shadoq/refio.git && cd refio
# Run in sandbox IDE
cd agent/plugin
./gradlew runIde # Linux/macOS
.\gradlew.bat runIde # Windows
# Or build plugin ZIP
./gradlew buildPlugin    # Output: build/distributions/refio-<version>.zip

- Open the Refio tool window (View → Tool Windows → Refio)
- Select execution mode (Chat/Plan/Agent)
- Choose LLM model from dropdown
- Type your prompt with optional @mentions
- Press Enter or click Send
| File | Location | Purpose |
|---|---|---|
| User config | ~/.refio/config.yaml |
Personal settings, API keys |
| Project config | <project>/.refio/config.yaml |
Project-specific prompts, MCP servers |
| Database | refio_poc.db (project root) |
Session data, messages, RAG index |
| Ignore patterns | <project>/.aiignore |
RAG indexing exclusions |
# ~/.refio/config.yaml
general:
  formatMarkdown: true
  streamingEnabled: true
providers:
  ollama:
    endpoint: "http://localhost:11434"
  anthropic:
    apiKey: "sk-ant-..."
models:
  defaults:
    chat: "ollama/qwen2.5:7b"
    coding: "ollama/qwen2.5-coder:7b"
    embedding: "ollama/nomic-embed-text"
  visibility:
    "ollama/qwen2.5:7b": true
    "anthropic/claude-3-5-sonnet-20241022": true

See docs/config.md for the full configuration reference.
| Layer | Protection |
|---|---|
| PathSandbox | All file ops restricted to project root |
| FileLimits | Size limits (2MB), excluded directories (24), extensions (34) |
| CommandDenylist | 58 dangerous patterns blocked (rm -rf, sudo, curl \| sh) |
| ToolPermissions | Per-mode access control (PLAN=read-only, AGENT=read-write) |
| No-Egress Mode | Blocks cloud providers, allows only Ollama/LM Studio |
| Secret Redaction | API keys masked in all logs |
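The secret-redaction layer can be sketched as a regex pass over log lines. The patterns below are assumptions about what counts as a secret (sk- prefixed keys, key=value assignments), not Refio's actual rules.

```kotlin
// Illustrative log redaction; the two patterns are hypothetical examples,
// not Refio's real secret-detection rules.
private val KEY_LITERAL = Regex("""sk-[A-Za-z0-9-]{8,}""")                 // sk-... style keys
private val KEY_ASSIGNMENT = Regex("""(?i)(api[_-]?key\s*[=:]\s*)(\S+)""") // apiKey: ... / api_key=...

fun redact(line: String): String =
    KEY_ASSIGNMENT.replace(KEY_LITERAL.replace(line, "***")) { m ->
        m.groupValues[1] + "***"
    }
```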
| Issue | Description | Mitigation |
|---|---|---|
| Symlink Escape | PathSandbox can be bypassed via symlinks | Detection + logging in place |
| Denylist Bypass | Terminal command patterns may be circumvented | Tool DISABLED by default |
Version: 0.0.1 (Full Kotlin plugin + embedded core)
CLI: Planned for v1.1+
Test Coverage: 0% (migration in progress)
v0.1 (Current)
- Complete architecture cleanup
- Fix P0 security issues
- Add Kotlin tests
- RAG file watcher
v0.2 (Planned)
- CLI parity (Ktor wrapper)
- Export/import flows
- Better context preview
- Cost controls
cd agent/plugin
# Development
./gradlew runIde # Run in sandbox IDE
./gradlew buildPlugin # Build ZIP distribution
# Quality
./gradlew detekt # Static analysis
./gradlew ktlintCheck # Lint check
./gradlew ktlintFormat # Auto-format
# Testing
./gradlew test               # Run all tests

src/main/kotlin/pl/jclab/refio/
├── core/ # Embedded core (no IDE dependencies)
│ ├── api/ # Router layer (9 domain routers)
│ ├── context/ # Context providers + MCP
│ ├── db/ # Database layer (Exposed ORM)
│ ├── llm/ # LLM integration (6 adapters)
│ ├── services/ # Core services (RAG, context, analysis)
│ ├── subagents/ # Subagent system
│ ├── tools/ # Tool system (10 implementations)
│ └── prompts/ # Prompt templates
├── services/ # Plugin services (project-scoped)
│ ├── session/ # SessionManager (6 components)
│ └── rag/ # Background indexing
└── ui/ # IntelliJ UI components
├── toolwindow/ # Tool window factory
├── components/ # Chat, toolbar, autocomplete
└── settings/ # 12+ settings panels
MIT License. See LICENSE.