
Refio

Refio – open source, local-first coding companion for IntelliJ IDEA


Project Overview

Refio is a local-first AI assistant packaged as an IntelliJ plugin written entirely in Kotlin. It supports three execution modes: Chat (conversation with project context), Plan (read-only analysis with tool usage), and Agent (full read/write access with automatic execution).

Design Philosophy

Unlike tools that send entire codebases to LLMs, Refio uses selective context injection through RAG (Retrieval-Augmented Generation) and comprehensive code analysis. This approach:

  • Reduces API costs by 50-70%
  • Provides faster responses
  • Works with smaller context windows (local models)
  • Maintains privacy (no-egress mode available)

Experimental Nature

This project demonstrates that AI coding agents can successfully build other coding agents. Refio serves as both a practical coding assistant and a proof-of-concept for AI-assisted development tool creation.


Key Features

| Feature | Description |
| --- | --- |
| Three Execution Modes | Chat, Plan, and Agent with mode-appropriate tools |
| Enhanced AgentTurnLoop | Self-directing execution with auto-compaction, prompt caching, and parallel tools (ADR-0028) |
| 18 Context Providers | @file, @folder, @codebase, @grep, @url, @docs, @commit, and more |
| RAG Indexing | Automatic project indexing with language-specific analyzers |
| 10 Tools | 5 read-only + 5 write tools with security layers |
| 6 LLM Adapters | Ollama, OpenAI, Anthropic, Gemini, OpenRouter, LM Studio |
| MCP Protocol | Full Model Context Protocol with 16 built-in presets |
| Subagents System | Specialized agents with custom prompts and tool permissions |
| Performance Optimizations | Token estimation, retry logic, working memory integration |
| No-Egress Mode | Block cloud providers, use only local models |
| Native UI | IntelliJ Swing components, no webview |

How Refio Works

Architecture Overview

┌─────────────────────────────────────────────────────────────────────────┐
│  UI Layer (IntelliJ Swing)                                              │
│  RefioToolWindowFactory → ChatView → PromptInputPanel                   │
├─────────────────────────────────────────────────────────────────────────┤
│  Service Layer (Project-Scoped)                                         │
│  SessionManager (facade)                                                │
│  ├── SessionStateManager (11 StateFlows for reactive UI)                │
│  ├── SessionLifecycleService (session create/switch/load)               │
│  ├── MessageDispatcher (CHAT vs PLAN/AGENT routing)                     │
│  └── SubtaskTracker (subtask lifecycle management)                      │
├─────────────────────────────────────────────────────────────────────────┤
│  Execution Layer                                                        │
│  ├── CHAT mode → WorkflowOrchestrator → ChatExecutor                    │
│  └── PLAN/AGENT mode → AgentTurnLoop (self-directing tool loop)         │
├─────────────────────────────────────────────────────────────────────────┤
│  Core Layer (In-Process API)                                            │
│  CoreApiRouter (facade + 9 domain routers)                              │
│  ├── ChatRouter, TaskRouter, SubtaskRouter                              │
│  ├── RagRouter, ToolRouter, PromptsRouter                               │
│  └── ContextService (dynamic context building)                          │
├─────────────────────────────────────────────────────────────────────────┤
│  Infrastructure Layer                                                   │
│  ├── LLMClient (unified) → 6 provider adapters                          │
│  ├── ToolRegistry → 10 tools with security layers                       │
│  ├── MCPManager → MCP server lifecycle (STDIO/HTTP)                     │
│  ├── EmbeddingsService → Ollama/OpenAI embeddings                       │
│  └── DatabaseFactory → SQLite (WAL) + Exposed ORM                       │
└─────────────────────────────────────────────────────────────────────────┘

Message Flow

User Input
    ↓
SessionManager.sendMessage()
    ↓
┌─────────────────────────────────────────────────────────────────────────┐
│ CHAT mode                         │ PLAN/AGENT mode                     │
│                                   │                                     │
│ WorkflowOrchestrator              │ AgentTurnLoop.runTurn()             │
│     ↓                             │     ↓                               │
│ IntentRouter                      │ 1. Save user message                │
│     ↓                             │ 2. Build prompt + tool descriptions │
│ ChatExecutor                      │ 3. Call LLM                         │
│     ↓                             │ 4. If tool calls:                   │
│ ChatService.chat()                │    - Create subtasks                │
│     ↓                             │    - Execute tools                  │
│ LLMClient.complete()              │    - Summarize results              │
│     ↓                             │    - Continue loop                  │
│ Response                          │ 5. Save final response              │
└─────────────────────────────────────────────────────────────────────────┘
    ↓
MessageDispatcher.loadMessages()
    ↓
UI Update via StateFlow
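
The CHAT vs PLAN/AGENT fork in the flow above can be sketched in Kotlin. The names here (`ModeDispatcher`, `dispatch`, the lambda parameters) are illustrative, not Refio's actual API:

```kotlin
enum class ExecutionMode { CHAT, PLAN, AGENT }

// Hypothetical sketch: CHAT routes to the chat pipeline, PLAN/AGENT to the
// agent turn loop, with PLAN restricted to read-only tools.
class ModeDispatcher(
    private val runChat: (String) -> String,
    private val runAgentTurn: (input: String, readOnly: Boolean) -> String,
) {
    fun dispatch(mode: ExecutionMode, input: String): String = when (mode) {
        ExecutionMode.CHAT -> runChat(input)              // WorkflowOrchestrator -> ChatExecutor
        ExecutionMode.PLAN -> runAgentTurn(input, true)   // AgentTurnLoop, READ_ONLY tools
        ExecutionMode.AGENT -> runAgentTurn(input, false) // AgentTurnLoop, all tools
    }
}
```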

Execution Modes

Chat Mode

Direct LLM conversation with full project context. No tool execution.

  • Uses ChatExecutor → ChatService
  • Streaming responses
  • Context includes: project analysis, conversation history, RAG fragments, @mentions

Plan Mode

Read-only analysis with tool usage for codebase exploration.

  • Uses AgentTurnLoop with READ_ONLY tools only
  • Model self-directs tool usage (read_file, grep_search, file_search, etc.)
  • Each tool call creates a tracked subtask
  • Ideal for: code review, architecture analysis, bug investigation

Agent Mode

Full read/write access with automatic execution.

  • Uses AgentTurnLoop with ALL tools
  • Model can: create files, edit code, run multi-file edits
  • Subtask lifecycle: PENDING → RUNNING → SUCCESS/FAILED
  • File snapshots before write operations (rollback support)
  • Safety: loop detection, error rate monitoring, max iterations
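
The subtask lifecycle above (PENDING → RUNNING → SUCCESS/FAILED) can be sketched as a small state machine. `Subtask` and its methods are hypothetical names, not the real SubtaskTracker API:

```kotlin
enum class SubtaskState { PENDING, RUNNING, SUCCESS, FAILED }

// Illustrative sketch: transitions are only legal in lifecycle order.
class Subtask(val description: String) {
    var state: SubtaskState = SubtaskState.PENDING
        private set

    fun start() {
        check(state == SubtaskState.PENDING) { "can only start a pending subtask" }
        state = SubtaskState.RUNNING
    }

    fun finish(succeeded: Boolean) {
        check(state == SubtaskState.RUNNING) { "can only finish a running subtask" }
        state = if (succeeded) SubtaskState.SUCCESS else SubtaskState.FAILED
    }
}
```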

Performance Optimizations (ADR-0028)

AgentTurnLoop includes production-grade enhancements:

| Optimization | Benefit |
| --- | --- |
| Auto-Compaction | Prevents context overflow by summarizing old messages at 80-85% capacity |
| Prompt Caching | Caches static prompts for 5 minutes, reducing construction overhead |
| Parallel Tools | Executes READ_ONLY tools concurrently (~2-3x faster for multi-tool calls) |
| Retry Logic | Exponential backoff for rate limits/timeouts (1s → 2s → 4s) |
| Token Estimation | Pre-flight counting prevents unexpected API errors |
| Working Memory | Auto-extracts knowledge from tool results for context continuity |

Configuration: Mode-specific settings (PLAN: 15 iterations, AGENT: 25 iterations)
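
The retry schedule above (1s → 2s → 4s) amounts to doubling a base delay on each failed attempt. A minimal blocking sketch; the real implementation is presumably coroutine-based, and `withRetry` is an illustrative name:

```kotlin
// Exponential backoff sketch: delays of baseDelayMs, 2x, 4x, ... between attempts.
fun <T> withRetry(maxAttempts: Int = 3, baseDelayMs: Long = 1_000, block: (attempt: Int) -> T): T {
    var lastError: Exception? = null
    for (attempt in 0 until maxAttempts) {
        try {
            return block(attempt)
        } catch (e: Exception) {
            lastError = e
            // Skip the sleep after the final attempt; 1s -> 2s -> 4s for defaults.
            if (attempt < maxAttempts - 1) Thread.sleep(baseDelayMs shl attempt)
        }
    }
    throw lastError!!
}
```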


Context System

Built-in Context Providers

| Provider | Type | Description |
| --- | --- | --- |
| @file | SUBMENU | File picker with search |
| @folder | SUBMENU | Directory structure browser |
| @current | NORMAL | Currently active editor file |
| @recent | SUBMENU | Recently edited files (15-file history) |
| @open_files | NORMAL | All open editor tabs |
| @clipboard | NORMAL | System clipboard content |
| @terminal | NORMAL | Recent terminal output (max 200 lines) |
| @problems | NORMAL | Compilation errors/warnings |
| @diff | NORMAL | Git uncommitted changes |
| @codebase | QUERY | Semantic search via RAG embeddings |
| @grep | QUERY | Regex search across project |
| @url | QUERY | Fetch web content (100KB max) |
| @commit | QUERY | Git commit details by hash |
| @docs | SUBMENU | Documentation search with semantic ranking |

Provider Types

  • NORMAL: No user input required (clipboard, current, terminal, problems, diff)
  • QUERY: Requires search term (@codebase:auth, @grep:TODO, @url:https://...)
  • SUBMENU: Interactive selection (@file, @folder, @recent, @docs)

Context Building Flow

User Input + @mentions
    ↓
ContextService.buildProjectContext()
    ├── 1. Load cached project analysis (architecture, dependencies)
    ├── 2. Extract conversation history (with summarization)
    ├── 3. Build subtask summaries (completed steps)
    ├── 4. Load RAG fragments (code + docs)
    ├── 5. Load MCP resources
    ├── 6. Resolve @mentions via ContextProviderRegistry
    └── 7. Combine into ProjectContextDTO
    ↓
Token budgeting (section limits, ~28K tokens default)
    ↓
Final prompt sent to LLM
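
Token budgeting needs a cheap pre-flight estimate before the LLM call. A common heuristic is roughly 4 characters per token; whether Refio's estimator uses this exact ratio is an assumption, and the function names are illustrative:

```kotlin
// Rough token estimate (~4 chars/token heuristic; the actual estimator may differ).
fun estimateTokens(text: String): Int = (text.length + 3) / 4

// Check combined sections against the ~28K default budget from the flow above.
fun withinBudget(sections: List<String>, budgetTokens: Int = 28_000): Boolean =
    sections.sumOf { estimateTokens(it) } <= budgetTokens
```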

RAG System

Indexing Pipeline

Project Files
    ↓
RagIndexingService.indexProject() [background, at startup]
    ├── Scan files (40+ extensions: .kt, .java, .py, .ts, .js, etc.)
    ├── Compute SHA-256 checksum (incremental detection)
    └── Classify: NEW / MODIFIED / UNCHANGED
    ↓
FileAnalyzerService.analyze()
    ├── Language detection
    ├── AST-like parsing (regex-based)
    └── Extract: classes, functions, imports, annotations
    ↓
ChunkingStrategy.createChunks()
    ├── SemanticChunkingStrategy (structure-aware)
    │   ├── Full-file chunks
    │   ├── Class-level chunks
    │   └── Function-level chunks
    └── DefaultChunkingStrategy (line-based fallback)
    ↓
EmbeddingsService.generateBatch()
    ├── Ollama: nomic-embed-text (768 dims)
    └── OpenAI: text-embedding-3-small (1536 dims)
    ↓
SQLite Storage
    ├── IndexFilesTable (metadata + checksum)
    ├── IndexChunksTable (content + positions)
    └── EmbeddingsTable (vector BLOB, little-endian float32)
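
The NEW / MODIFIED / UNCHANGED classification in the pipeline above follows from comparing a file's current SHA-256 against the stored checksum. A self-contained sketch (function and enum names are illustrative):

```kotlin
import java.security.MessageDigest

enum class IndexState { NEW, MODIFIED, UNCHANGED }

// Hex-encoded SHA-256 of the file content.
fun sha256(bytes: ByteArray): String =
    MessageDigest.getInstance("SHA-256").digest(bytes)
        .joinToString("") { (it.toInt() and 0xff).toString(16).padStart(2, '0') }

// Compare against previously indexed checksums (path -> checksum).
fun classify(path: String, content: ByteArray, known: Map<String, String>): IndexState {
    val checksum = sha256(content)
    return when (known[path]) {
        null -> IndexState.NEW           // never indexed
        checksum -> IndexState.UNCHANGED // checksum matches, skip re-embedding
        else -> IndexState.MODIFIED      // content changed, re-analyze
    }
}
```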

Search Pipeline

Query: "authentication logic"
    ↓
EmbeddingProvider.generateEmbedding(query)
    ↓
RagRepository.getEmbeddings(projectRoot)
    ↓
Cosine Similarity Calculation
    cos(q, e) = (q · e) / (||q|| × ||e||)
    ↓
Filter: similarity >= threshold (default 0.5)
    ↓
Sort by similarity, take topK (default 5)
    ↓
Optional: Hybrid search (70% semantic + 30% keyword)
    ↓
RagSearchResult[]
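
The pipeline above is a brute-force vector search: score every stored embedding against the query, filter by the default threshold (0.5), and keep the top 5. A sketch of that scoring; the real RagRepository decodes little-endian float32 BLOBs from SQLite rather than taking an in-memory map:

```kotlin
import kotlin.math.sqrt

// cos(q, e) = (q . e) / (||q|| * ||e||), matching the formula above.
fun cosine(q: FloatArray, e: FloatArray): Double {
    require(q.size == e.size) { "dimension mismatch" }
    var dot = 0.0; var nq = 0.0; var ne = 0.0
    for (i in q.indices) { dot += q[i] * e[i]; nq += q[i] * q[i]; ne += e[i] * e[i] }
    return dot / (sqrt(nq) * sqrt(ne))
}

// Filter by threshold, sort by similarity, take topK.
fun search(
    query: FloatArray,
    index: Map<String, FloatArray>,
    threshold: Double = 0.5,
    topK: Int = 5,
): List<Pair<String, Double>> =
    index.map { (id, emb) -> id to cosine(query, emb) }
        .filter { it.second >= threshold }
        .sortedByDescending { it.second }
        .take(topK)
```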

Language Analyzers

| Language | Analyzer | Extracts |
| --- | --- | --- |
| Kotlin | KotlinLanguageAnalyzer | Classes, objects, functions, data classes, coroutines |
| Java | JavaLanguageAnalyzer | Classes, interfaces, methods, annotations (Spring) |
| Python | PythonLanguageAnalyzer | Classes, functions, decorators, type hints |
| TypeScript | TypeScriptLanguageAnalyzer | Classes, interfaces, functions, React components |
| HTML | HtmlLanguageAnalyzer | Structure, scripts, styles |

Tools

READ_ONLY Tools (5)

| Tool | Parameters | Description |
| --- | --- | --- |
| read_file | path | Read file content (2MB max) |
| read_directory | path, recursive, max_depth | List directory tree |
| file_search | pattern, path, offset, limit | Glob pattern search |
| grep_search | pattern, path, case_sensitive | Regex content search |
| view_diff | file1, file2 OR content2 | Line-by-line comparison |

WRITE Tools (5, plus a disabled terminal tool)

| Tool | Parameters | Description | Cost |
| --- | --- | --- | --- |
| create_new_file | path, content | Create file with parent dirs | Free |
| code_editing | path, old_string, new_string, replace_all | Search-and-replace | Free |
| multi_edit | edits[] | Atomic multi-file edit | Free |
| multi_line_editor | path, edit_description | LLM identifies line ranges | ~$0.02 |
| advance_code_editing | path, edit_description | Full file regeneration | ~$0.06 |
| run_terminal_command | command | Shell execution (disabled by default) | Free |

Tool Availability by Mode

| Mode | READ_ONLY | WRITE |
| --- | --- | --- |
| CHAT | ✗ | ✗ |
| PLAN | ✓ | ✗ |
| AGENT | ✓ | ✓ |

Subagents

Specialized AI assistants invoked with a !agent-name prefix.

Built-in Agents

| Agent | Model | Purpose |
| --- | --- | --- |
| security-reviewer | weak | Security audits, OWASP vulnerabilities |
| code-reviewer | default | Code quality, patterns, bugs |

Custom Agents

Create custom agents in:

  • User-level: ~/.refio/agents/my-agent.md
  • Project-level: .refio/agents/my-agent.md
---
name: my-agent
description: Custom agent for specific task
tools: read_file, grep_search
model: default
priority: 5
enabled: true
---

You are a specialized assistant for...
[system prompt content]
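
An agent file like the one above splits into YAML frontmatter and a system prompt at the --- delimiters. A minimal sketch of that split; this is not the actual parser, which presumably uses a YAML library such as KAML:

```kotlin
// Split "---\n<frontmatter>\n---\n<prompt>" into a key/value map and the prompt.
// Only handles flat "key: value" lines; nested YAML would need a real parser.
fun splitAgentFile(text: String): Pair<Map<String, String>, String> {
    val parts = text.split("---", limit = 3)
    require(parts.size == 3) { "expected ----delimited frontmatter" }
    val meta = parts[1].lines()
        .mapNotNull { line ->
            val idx = line.indexOf(':')
            if (idx > 0) line.take(idx).trim() to line.drop(idx + 1).trim() else null
        }
        .toMap()
    return meta to parts[2].trim()
}
```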

Model Aliases

| Alias | Maps To |
| --- | --- |
| inherit | Parent conversation model |
| default | ConfigService DEFAULT model |
| plan | ConfigService PLAN model |
| coding | ConfigService CODING model |
| weak | ConfigService WEAK model |
| sonnet | claude-3-5-sonnet-20241022 |
| opus | claude-3-opus-20240229 |
| haiku | claude-3-haiku-20240307 |

MCP Protocol

Full Model Context Protocol implementation with STDIO and HTTP/SSE transports.

16 Built-in Presets

| Category | Servers |
| --- | --- |
| VCS | GitHub, GitLab |
| Databases | PostgreSQL, SQLite |
| Search | Brave Search, Exa |
| Docs | Context7 |
| DevOps | Sentry, AWS |
| Storage | Google Drive, Filesystem |
| Development | Puppeteer, Sequential Thinking |
| Collaboration | Slack |
| Other | Memory, Custom API |

Configuration

# .refio/config.yaml
mcp:
  servers:
    - id: "github"
      type: "STDIO"
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-github"]
      accessMode: "READ"
      enabled: true
      env:
        - name: "GITHUB_TOKEN"
          value: "${GITHUB_TOKEN}"
          isSecret: true

LLM Providers

| Provider | Models | Features |
| --- | --- | --- |
| Ollama | Local models | Free, JSON mode, local privacy |
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3, GPT-5 | Responses API, reasoning models |
| Anthropic | Claude 3.5/3.7, Opus 4.1 | Thinking mode, top-level system |
| Gemini | 2.5 Flash/Pro | system_instruction, thinkingConfig |
| OpenRouter | All providers | Unified gateway, dynamic pricing |
| LM Studio | Local models | OpenAI-compatible, free |

Pricing (per 1M tokens, USD)

| Model | Input | Output |
| --- | --- | --- |
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4o | $2.50 | $10.00 |
| claude-3-5-sonnet | $3.00 | $15.00 |
| claude-opus-4-1 | $15.00 | $75.00 |
| Ollama (local) | $0.00 | $0.00 |
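
The cost of a single call follows directly from the per-1M-token prices above; `costUsd` is an illustrative helper, not a Refio function:

```kotlin
// Cost in USD for one request, given per-1M-token input/output prices.
// Example prices from the table: gpt-4o is $2.50 in / $10.00 out per 1M tokens.
fun costUsd(inputTokens: Long, outputTokens: Long, inputPerM: Double, outputPerM: Double): Double =
    inputTokens * inputPerM / 1_000_000 + outputTokens * outputPerM / 1_000_000
```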

Tech Stack

  • Language/Platform: Kotlin 1.9.25, JVM target 17, IntelliJ Platform 2024.1.7 (IC)
  • UI: Native IntelliJ Swing components (no webview)
  • Core/Transport: In-process CoreApiRouter (optional Ktor server for CLI, planned v1.1+)
  • LLM/HTTP: Ktor Client 2.3.7 with 6 provider adapters
  • Database: SQLite (WAL) via Exposed 0.46.0 + sqlite-jdbc 3.44.1.0
  • Serialization: Gson 2.10.1, kotlinx-serialization-json 1.6.2, KAML 0.55.0
  • Markdown: commonmark 0.21.0
  • Logging: kotlin-logging + logback 1.4.14
  • Build: Gradle IntelliJ Plugin 1.17.4

Getting Started

Prerequisites

  1. JDK 17 and IntelliJ IDEA 2024.x
  2. Ollama from https://ollama.com/
  3. Required models:
    ollama pull nomic-embed-text:latest  # Embeddings
    ollama pull qwen2.5-coder:14b        # Coding

Installation

# Clone repository
git clone https://github.com/shadoq/refio.git && cd refio

# Run in sandbox IDE
cd agent/plugin
./gradlew runIde          # Linux/macOS
.\gradlew.bat runIde      # Windows

# Or build plugin ZIP
./gradlew buildPlugin     # Output: build/distributions/refio-<version>.zip

First Steps

  1. Open the Refio tool window (View → Tool Windows → Refio)
  2. Select execution mode (Chat/Plan/Agent)
  3. Choose LLM model from dropdown
  4. Type your prompt with optional @mentions
  5. Press Enter or click Send

Configuration

File Locations

| File | Location | Purpose |
| --- | --- | --- |
| User config | ~/.refio/config.yaml | Personal settings, API keys |
| Project config | <project>/.refio/config.yaml | Project-specific prompts, MCP servers |
| Database | refio_poc.db (project root) | Session data, messages, RAG index |
| Ignore patterns | <project>/.aiignore | RAG indexing exclusions |

Quick Start Config

# ~/.refio/config.yaml
general:
  formatMarkdown: true
  streamingEnabled: true

providers:
  ollama:
    endpoint: "http://localhost:11434"
  anthropic:
    apiKey: "sk-ant-..."

models:
  defaults:
    chat: "ollama/qwen2.5:7b"
    coding: "ollama/qwen2.5-coder:7b"
    embedding: "ollama/nomic-embed-text"

  visibility:
    "ollama/qwen2.5:7b": true
    "anthropic/claude-3-5-sonnet-20241022": true

See docs/config.md for full configuration reference.


Security

Defense Layers

| Layer | Protection |
| --- | --- |
| PathSandbox | All file ops restricted to project root |
| FileLimits | Size limits (2MB), excluded directories (24), extensions (34) |
| CommandDenylist | 58 dangerous patterns blocked (e.g. rm -rf, sudo, piping curl to sh) |
| ToolPermissions | Per-mode access control (PLAN=read-only, AGENT=read-write) |
| No-Egress Mode | Blocks cloud providers, allows only Ollama/LM Studio |
| Secret Redaction | API keys masked in all logs |
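
A PathSandbox-style check typically canonicalizes both paths and requires the candidate to stay under the project root. A sketch under that assumption (`isInsideProject` is an illustrative name); note that canonicalization alone does not close every symlink race, which is why the Known Issues section below still lists a symlink escape:

```kotlin
import java.io.File

// Reject any file operation whose canonical path leaves the project root.
// Canonicalization resolves ".." segments and existing symlinks, but a
// time-of-check/time-of-use race remains possible; the real PathSandbox
// adds detection and logging on top of this.
fun isInsideProject(projectRoot: File, candidate: File): Boolean {
    val root = projectRoot.canonicalFile.path + File.separator
    return candidate.canonicalFile.path.startsWith(root)
}
```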

Known Issues (P0)

| Issue | Description | Mitigation |
| --- | --- | --- |
| Symlink Escape | PathSandbox can be bypassed via symlinks | Detection + logging in place |
| Denylist Bypass | Terminal command patterns may be circumvented | Tool DISABLED by default |

Project Status

Version: 0.0.1 (Full Kotlin plugin + embedded core)

CLI: Planned for v1.1+

Test Coverage: 0% (migration in progress)

Roadmap

v0.1 (Current)

  • Complete architecture cleanup
  • Fix P0 security issues
  • Add Kotlin tests
  • RAG file watcher

v0.2 (Planned)

  • CLI parity (Ktor wrapper)
  • Export/import flows
  • Better context preview
  • Cost controls

Available Scripts

cd agent/plugin

# Development
./gradlew runIde              # Run in sandbox IDE
./gradlew buildPlugin         # Build ZIP distribution

# Quality
./gradlew detekt              # Static analysis
./gradlew ktlintCheck         # Lint check
./gradlew ktlintFormat        # Auto-format

# Testing
./gradlew test                # Run all tests

Repository Structure

src/main/kotlin/pl/jclab/refio/
├── core/                     # Embedded core (no IDE dependencies)
│   ├── api/                  # Router layer (9 domain routers)
│   ├── context/              # Context providers + MCP
│   ├── db/                   # Database layer (Exposed ORM)
│   ├── llm/                  # LLM integration (6 adapters)
│   ├── services/             # Core services (RAG, context, analysis)
│   ├── subagents/            # Subagent system
│   ├── tools/                # Tool system (10 implementations)
│   └── prompts/              # Prompt templates
├── services/                 # Plugin services (project-scoped)
│   ├── session/              # SessionManager (6 components)
│   └── rag/                  # Background indexing
└── ui/                       # IntelliJ UI components
    ├── toolwindow/           # Tool window factory
    ├── components/           # Chat, toolbar, autocomplete
    └── settings/             # 12+ settings panels

License

MIT License. See LICENSE.
