Refio – open-source, local-first coding companion for IntelliJ IDEA
- Project Overview
- Key Features
- How Refio Works
- Execution Modes
- Context System
- RAG System
- Tools
- Subagents
- MCP Protocol
- LLM Providers
- Tech Stack
- Getting Started
- Configuration
- Security
- Project Status
- License
Refio is a local-first AI assistant packaged as an IntelliJ plugin written entirely in Kotlin. It supports three execution modes: Chat (conversation with project context), Plan (read-only analysis with tool usage), and Agent (full read/write access with automatic execution).
Unlike tools that send entire codebases to LLMs, Refio uses selective context injection through RAG (Retrieval-Augmented Generation) and comprehensive code analysis. This approach:
- Reduces API costs by 50-70%
- Provides faster responses
- Works with smaller context windows (local models)
- Maintains privacy (no-egress mode available)
This project demonstrates that AI coding agents can successfully build other coding agents. Refio serves as both a practical coding assistant and a proof-of-concept for AI-assisted development tool creation.
| Feature | Description |
|---|---|
| Three Execution Modes | Chat, Plan, and Agent with mode-appropriate tools |
| Enhanced AgentTurnLoop | AgentTurnLoop execution + auto-compaction, prompt caching, parallel tools (ADR-0028) |
| 18 Context Providers | @file, @folder, @codebase, @grep, @url, @docs, @commit, and more |
| RAG Indexing | Automatic project indexing with language-specific analyzers |
| 10 Tools | 5 read-only + 5 write tools with security layers |
| 6 LLM Adapters | Ollama, OpenAI, Anthropic, Gemini, OpenRouter, LM Studio |
| MCP Protocol | Full Model Context Protocol with 16 built-in presets |
| Subagents System | Specialized agents with custom prompts and tool permissions |
| Performance Optimizations | Token estimation, retry logic, working memory integration |
| No-Egress Mode | Block cloud providers, use only local models |
| Native UI | IntelliJ Swing components, no webview |
┌─────────────────────────────────────────────────────────────────────────┐
│ UI Layer (IntelliJ Swing) │
│ RefioToolWindowFactory → ChatView → PromptInputPanel │
├─────────────────────────────────────────────────────────────────────────┤
│ Service Layer (Project-Scoped) │
│ SessionManager (facade) │
│ ├── SessionStateManager (11 StateFlows for reactive UI) │
│ ├── SessionLifecycleService (session create/switch/load) │
│ ├── MessageDispatcher (CHAT vs PLAN/AGENT routing) │
│ └── SubtaskTracker (subtask lifecycle management) │
├─────────────────────────────────────────────────────────────────────────┤
│ Execution Layer │
│ ├── CHAT mode → WorkflowOrchestrator → ChatExecutor │
│ └── PLAN/AGENT mode → AgentTurnLoop (self-directing tool loop) │
├─────────────────────────────────────────────────────────────────────────┤
│ Core Layer (In-Process API) │
│ CoreApiRouter (facade + 9 domain routers) │
│ ├── ChatRouter, TaskRouter, SubtaskRouter │
│ ├── RagRouter, ToolRouter, PromptsRouter │
│ └── ContextService (dynamic context building) │
├─────────────────────────────────────────────────────────────────────────┤
│ Infrastructure Layer │
│ ├── LLMClient (unified) → 6 provider adapters │
│ ├── ToolRegistry → 10 tools with security layers │
│ ├── MCPManager → MCP server lifecycle (STDIO/HTTP) │
│ ├── EmbeddingsService → Ollama/OpenAI embeddings │
│ └── DatabaseFactory → SQLite (WAL) + Exposed ORM │
└─────────────────────────────────────────────────────────────────────────┘
User Input
↓
SessionManager.sendMessage()
↓
┌─────────────────────────────────────────────────────────────────────────┐
│ CHAT mode │ PLAN/AGENT mode │
│ │ │
│ WorkflowOrchestrator │ AgentTurnLoop.runTurn() │
│ ↓ │ ↓ │
│ IntentRouter │ 1. Save user message │
│ ↓ │ 2. Build prompt + tool descriptions │
│ ChatExecutor │ 3. Call LLM │
│ ↓ │ 4. If tool calls: │
│ ChatService.chat() │ - Create subtasks │
│ ↓ │ - Execute tools │
│ LLMClient.complete() │ - Summarize results │
│ ↓ │ - Continue loop │
│ Response │ 5. Save final response │
└─────────────────────────────────────────────────────────────────────────┘
↓
MessageDispatcher.loadMessages()
↓
UI Update via StateFlow
Chat mode: a direct LLM conversation with full project context; no tool execution.
- Uses ChatExecutor → ChatService
- Streaming responses
- Context includes: project analysis, conversation history, RAG fragments, @mentions
Plan mode: read-only analysis with tool usage for codebase exploration.
- Uses AgentTurnLoop with READ_ONLY tools only
- Model self-directs tool usage (read_file, grep_search, file_search, etc.)
- Each tool call creates a tracked subtask
- Ideal for: code review, architecture analysis, bug investigation
Agent mode: full read/write access with automatic execution.
- Uses AgentTurnLoop with ALL tools
- Model can: create files, edit code, run multi-file edits
- Subtask lifecycle: PENDING → RUNNING → SUCCESS/FAILED
- File snapshots before write operations (rollback support)
- Safety: loop detection, error rate monitoring, max iterations
AgentTurnLoop includes production-grade enhancements:
| Optimization | Benefit |
|---|---|
| Auto-Compaction | Prevents context overflow by summarizing old messages at 80-85% capacity |
| Prompt Caching | Caches static prompts for 5min, reducing construction overhead |
| Parallel Tools | Executes READ_ONLY tools concurrently (~2-3x faster for multi-tool calls) |
| Retry Logic | Exponential backoff for rate limits/timeouts (1s → 2s → 4s) |
| Token Estimation | Pre-flight counting prevents unexpected API errors |
| Working Memory | Auto-extracts knowledge from tool results for context continuity |
Configuration: Mode-specific settings (PLAN: 15 iterations, AGENT: 25 iterations)
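The retry schedule above (1s → 2s → 4s) can be sketched as follows. This is an illustrative implementation of exponential backoff, not Refio's actual code; the class and function names are hypothetical.

```kotlin
// Sketch of exponential backoff for rate limits/timeouts (1s -> 2s -> 4s).
// RetryPolicy and its members are illustrative names, not Refio's API.
class RetryPolicy(
    private val baseDelayMs: Long = 1_000,
    private val maxAttempts: Int = 3,
) {
    // Delay before retrying after attempt `attempt` (0-based): base * 2^attempt.
    fun backoffDelayMs(attempt: Int): Long = baseDelayMs shl attempt

    // Runs `block`, retrying on exception up to maxAttempts times.
    fun <T> execute(sleep: (Long) -> Unit = { Thread.sleep(it) }, block: () -> T): T {
        var lastError: Exception? = null
        repeat(maxAttempts) { attempt ->
            try {
                return block()
            } catch (e: Exception) {
                lastError = e
                if (attempt < maxAttempts - 1) sleep(backoffDelayMs(attempt))
            }
        }
        throw lastError!!
    }
}
```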
| Provider | Type | Description |
|---|---|---|
| @file | SUBMENU | File picker with search |
| @folder | SUBMENU | Directory structure browser |
| @current | NORMAL | Currently active editor file |
| @recent | SUBMENU | Recently edited files (15-file history) |
| @open_files | NORMAL | All open editor tabs |
| @clipboard | NORMAL | System clipboard content |
| @terminal | NORMAL | Recent terminal output (max 200 lines) |
| @problems | NORMAL | Compilation errors/warnings |
| @diff | NORMAL | Git uncommitted changes |
| @codebase | QUERY | Semantic search via RAG embeddings |
| @grep | QUERY | Regex search across project |
| @url | QUERY | Fetch web content (100KB max) |
| @commit | QUERY | Git commit details by hash |
| @docs | SUBMENU | Documentation search with semantic ranking |
- NORMAL: No user input required (clipboard, current, terminal, problems, diff)
- QUERY: Requires search term (@codebase:auth, @grep:TODO, @url:https://...)
- SUBMENU: Interactive selection (@file, @folder, @recent, @docs)
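A hypothetical parser for the three mention shapes above. The provider names come from the table; the parsing code itself is an illustration, not Refio's implementation.

```kotlin
// Illustrative @mention parsing; provider-to-kind mapping follows the table
// above, but the code is a sketch, not Refio's ContextProviderRegistry.
enum class MentionKind { NORMAL, QUERY, SUBMENU }

data class Mention(val provider: String, val query: String?)

private val QUERY_PROVIDERS = setOf("codebase", "grep", "url", "commit")
private val SUBMENU_PROVIDERS = setOf("file", "folder", "recent", "docs")

fun kindOf(provider: String): MentionKind = when (provider) {
    in QUERY_PROVIDERS -> MentionKind.QUERY
    in SUBMENU_PROVIDERS -> MentionKind.SUBMENU
    else -> MentionKind.NORMAL
}

// Matches @provider or @provider:query (query runs to the next whitespace).
private val MENTION = Regex("""@([a-z_]+)(?::(\S+))?""")

fun parseMentions(input: String): List<Mention> =
    MENTION.findAll(input).map { m ->
        Mention(m.groupValues[1], m.groupValues[2].ifEmpty { null })
    }.toList()
```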
User Input + @mentions
↓
ContextService.buildProjectContext()
├── 1. Load cached project analysis (architecture, dependencies)
├── 2. Extract conversation history (with summarization)
├── 3. Build subtask summaries (completed steps)
├── 4. Load RAG fragments (code + docs)
├── 5. Load MCP resources
├── 6. Resolve @mentions via ContextProviderRegistry
└── 7. Combine into ProjectContextDTO
↓
Token budgeting (section limits, ~28K tokens default)
↓
Final prompt sent to LLM
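The token-budgeting step can be sketched as below. The chars/4 heuristic is a common rough estimate for English text and code; Refio's actual estimator and section limits may differ.

```kotlin
// Hypothetical per-section token budgeting (~28K tokens total by default).
// chars/4 is a rough approximation, not Refio's exact token counter.
fun estimateTokens(text: String): Int = (text.length + 3) / 4

// Truncates each named section to its token cap so the combined prompt
// stays within the overall budget.
fun budgetSections(
    sections: List<Pair<String, String>>,  // (section name, content)
    limits: Map<String, Int>,              // per-section token caps
): String = sections.joinToString("\n\n") { (name, content) ->
    val cap = limits[name] ?: Int.MAX_VALUE
    if (estimateTokens(content) <= cap) content
    else content.take(cap * 4)             // cut back to roughly `cap` tokens
}
```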
Project Files
↓
RagIndexingService.indexProject() [background, at startup]
├── Scan files (40+ extensions: .kt, .java, .py, .ts, .js, etc.)
├── Compute SHA-256 checksum (incremental detection)
└── Classify: NEW / MODIFIED / UNCHANGED
↓
FileAnalyzerService.analyze()
├── Language detection
├── AST-like parsing (regex-based)
└── Extract: classes, functions, imports, annotations
↓
ChunkingStrategy.createChunks()
├── SemanticChunkingStrategy (structure-aware)
│ ├── Full-file chunks
│ ├── Class-level chunks
│ └── Function-level chunks
└── DefaultChunkingStrategy (line-based fallback)
↓
EmbeddingsService.generateBatch()
├── Ollama: nomic-embed-text (768 dims)
└── OpenAI: text-embedding-3-small (1536 dims)
↓
SQLite Storage
├── IndexFilesTable (metadata + checksum)
├── IndexChunksTable (content + positions)
└── EmbeddingsTable (vector BLOB, little-endian float32)
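The final storage step stores each vector as little-endian float32 values in a BLOB, as noted above. A minimal sketch of that byte layout (function names are illustrative):

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Sketch of the EmbeddingsTable vector format: little-endian float32 BLOB.
fun encodeEmbedding(vector: FloatArray): ByteArray {
    val buf = ByteBuffer.allocate(vector.size * 4).order(ByteOrder.LITTLE_ENDIAN)
    vector.forEach { buf.putFloat(it) }
    return buf.array()
}

fun decodeEmbedding(blob: ByteArray): FloatArray {
    val buf = ByteBuffer.wrap(blob).order(ByteOrder.LITTLE_ENDIAN)
    return FloatArray(blob.size / 4) { buf.getFloat() }
}
```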
Query: "authentication logic"
↓
EmbeddingProvider.generateEmbedding(query)
↓
RagRepository.getEmbeddings(projectRoot)
↓
Cosine Similarity Calculation
cos(q, e) = (q · e) / (||q|| × ||e||)
↓
Filter: similarity >= threshold (default 0.5)
↓
Sort by similarity, take topK (default 5)
↓
Optional: Hybrid search (70% semantic + 30% keyword)
↓
RagSearchResult[]
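The scoring pipeline above (cosine similarity, threshold filter, top-K) amounts to the following sketch. `RagHit` is an illustrative stand-in for RagSearchResult; the hybrid 70/30 weighting step is omitted.

```kotlin
import kotlin.math.sqrt

// Illustrative retrieval math: cosine similarity, threshold 0.5, topK 5.
data class RagHit(val chunkId: Long, val score: Double)

fun cosine(q: FloatArray, e: FloatArray): Double {
    var dot = 0.0; var nq = 0.0; var ne = 0.0
    for (i in q.indices) {
        dot += q[i] * e[i]
        nq += q[i] * q[i]
        ne += e[i] * e[i]
    }
    return if (nq == 0.0 || ne == 0.0) 0.0 else dot / (sqrt(nq) * sqrt(ne))
}

fun search(
    query: FloatArray,
    embeddings: Map<Long, FloatArray>,  // chunkId -> stored vector
    threshold: Double = 0.5,
    topK: Int = 5,
): List<RagHit> = embeddings
    .map { (id, vec) -> RagHit(id, cosine(query, vec)) }
    .filter { it.score >= threshold }
    .sortedByDescending { it.score }
    .take(topK)
```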
| Language | Analyzer | Extracts |
|---|---|---|
| Kotlin | KotlinLanguageAnalyzer | Classes, objects, functions, data classes, coroutines |
| Java | JavaLanguageAnalyzer | Classes, interfaces, methods, annotations (Spring) |
| Python | PythonLanguageAnalyzer | Classes, functions, decorators, type hints |
| TypeScript | TypeScriptLanguageAnalyzer | Classes, interfaces, functions, React components |
| HTML | HtmlLanguageAnalyzer | Structure, scripts, styles |
| Tool | Parameters | Description |
|---|---|---|
| read_file | path | Read file content (2MB max) |
| read_directory | path, recursive, max_depth | List directory tree |
| file_search | pattern, path, offset, limit | Glob pattern search |
| grep_search | pattern, path, case_sensitive | Regex content search |
| view_diff | file1, file2 OR content2 | Line-by-line comparison |
| Tool | Parameters | Description | Cost |
|---|---|---|---|
| create_new_file | path, content | Create file with parent dirs | Free |
| code_editing | path, old_string, new_string, replace_all | Search-and-replace | Free |
| multi_edit | edits[] | Atomic multi-file edit | Free |
| multi_line_editor | path, edit_description | LLM identifies line ranges | ~$0.02 |
| advance_code_editing | path, edit_description | Full file regeneration | ~$0.06 |
| run_terminal_command | command | Shell execution (DISABLED) | Free |
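The code_editing semantics can be sketched as a literal (non-regex) search-and-replace. This is an assumption based on the old_string/new_string/replace_all parameter names; the real tool additionally runs inside the path sandbox and snapshots files before writing.

```kotlin
// Illustrative search-and-replace edit; assumes literal matching and a
// uniqueness requirement when replace_all is false. Not Refio's actual tool.
fun applyEdit(
    content: String,
    oldString: String,
    newString: String,
    replaceAll: Boolean = false,
): String {
    require(content.contains(oldString)) { "old_string not found in file" }
    if (!replaceAll) {
        // Requiring a unique match avoids ambiguous edits.
        require(content.indexOf(oldString) == content.lastIndexOf(oldString)) {
            "old_string matches more than once; pass replace_all=true"
        }
    }
    return if (replaceAll) content.replace(oldString, newString)
    else content.replaceFirst(oldString, newString)
}
```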
| Mode | READ_ONLY | WRITE |
|---|---|---|
| CHAT | ❌ | ❌ |
| PLAN | ✅ | ❌ |
| AGENT | ✅ | ✅ |
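The permission matrix above reduces to a simple per-mode filter. Type names here are illustrative, not Refio's ToolRegistry API.

```kotlin
// Sketch of per-mode tool filtering matching the matrix above.
enum class Mode { CHAT, PLAN, AGENT }
enum class ToolAccess { READ_ONLY, WRITE }

data class Tool(val name: String, val access: ToolAccess)

fun allowedTools(mode: Mode, registry: List<Tool>): List<Tool> = when (mode) {
    Mode.CHAT -> emptyList()                                         // no tools at all
    Mode.PLAN -> registry.filter { it.access == ToolAccess.READ_ONLY }
    Mode.AGENT -> registry                                           // full access
}
```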
Specialized AI assistants invoked with !agent-name prefix.
| Agent | Model | Purpose |
|---|---|---|
| security-reviewer | weak | Security audits, OWASP vulnerabilities |
| code-reviewer | default | Code quality, patterns, bugs |
Create custom agents in:
- User-level: ~/.refio/agents/my-agent.md
- Project-level: <project>/.refio/agents/my-agent.md
---
name: my-agent
description: Custom agent for specific task
tools: read_file, grep_search
model: default
priority: 5
enabled: true
---
You are a specialized assistant for...
[system prompt content]

The model field accepts the following aliases:

| Alias | Maps To |
|---|---|
| inherit | Parent conversation model |
| default | ConfigService DEFAULT model |
| plan | ConfigService PLAN model |
| coding | ConfigService CODING model |
| weak | ConfigService WEAK model |
| sonnet | claude-3-5-sonnet-20241022 |
| opus | claude-3-opus-20240229 |
| haiku | claude-3-haiku-20240307 |
Full Model Context Protocol implementation with STDIO and HTTP/SSE transports.
| Category | Servers |
|---|---|
| VCS | GitHub, GitLab |
| Databases | PostgreSQL, SQLite |
| Search | Brave Search, Exa |
| Docs | Context7 |
| DevOps | Sentry, AWS |
| Storage | Google Drive, Filesystem |
| Development | Puppeteer, Sequential Thinking |
| Collaboration | Slack |
| Other | Memory, Custom API |
# .refio/config.yaml
mcp:
  servers:
    - id: "github"
      type: "STDIO"
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-github"]
      accessMode: "READ"
      enabled: true
      env:
        - name: "GITHUB_TOKEN"
          value: "${GITHUB_TOKEN}"
          isSecret: true

| Provider | Models | Features |
|---|---|---|
| Ollama | Local models | Free, JSON mode, local privacy |
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3, GPT-5 | Responses API, reasoning models |
| Anthropic | Claude 3.5/3.7, Opus 4.1 | Thinking mode, top-level system |
| Gemini | 2.5 Flash/Pro | system_instruction, thinkingConfig |
| OpenRouter | All providers | Unified gateway, dynamic pricing |
| LM Studio | Local models | OpenAI-compatible, free |
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gpt-4o-mini | $0.15 | $0.60 |
| gpt-4o | $2.50 | $10.00 |
| claude-3-5-sonnet | $3.00 | $15.00 |
| claude-opus-4-1 | $15.00 | $75.00 |
| Ollama (local) | $0.00 | $0.00 |
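Given the per-1M-token prices above (the convention these providers use), estimating the cost of a request is straightforward. The function name is illustrative:

```kotlin
// Sketch of cost estimation from the pricing table, assuming prices are
// USD per 1M tokens.
fun estimateCostUsd(
    inputTokens: Long,
    outputTokens: Long,
    inputPricePerM: Double,
    outputPricePerM: Double,
): Double = inputTokens / 1_000_000.0 * inputPricePerM +
    outputTokens / 1_000_000.0 * outputPricePerM
```

For example, a gpt-4o call with 10K input and 2K output tokens would cost roughly 10,000/1M × $2.50 + 2,000/1M × $10.00 = $0.045.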
- Language/Platform: Kotlin 1.9.25, JVM target 17, IntelliJ Platform 2024.1.7 (IC)
- UI: Native IntelliJ Swing components (no webview)
- Core/Transport: In-process CoreApiRouter (optional Ktor server for CLI, planned v1.1+)
- LLM/HTTP: Ktor Client 2.3.7 with 6 provider adapters
- Database: SQLite (WAL) via Exposed 0.46.0 + sqlite-jdbc 3.44.1.0
- Serialization: Gson 2.10.1, kotlinx-serialization-json 1.6.2, KAML 0.55.0
- Markdown: commonmark 0.21.0
- Logging: kotlin-logging + logback 1.4.14
- Build: Gradle IntelliJ Plugin 1.17.4
- JDK 17 and IntelliJ IDEA 2024.x
- Ollama from https://ollama.com/
- Required models:
ollama pull nomic-embed-text:latest   # Embeddings
ollama pull qwen2.5-coder:14b         # Coding
# Clone repository
git clone https://github.com/shadoq/refio.git && cd refio
# Run in sandbox IDE
cd agent/plugin
./gradlew runIde # Linux/macOS
.\gradlew.bat runIde # Windows
# Or build plugin ZIP
./gradlew buildPlugin    # Output: build/distributions/refio-<version>.zip

- Open the Refio tool window (View → Tool Windows → Refio)
- Select execution mode (Chat/Plan/Agent)
- Choose LLM model from dropdown
- Type your prompt with optional @mentions
- Press Enter or click Send
| File | Location | Purpose |
|---|---|---|
| User config | ~/.refio/config.yaml |
Personal settings, API keys |
| Project config | <project>/.refio/config.yaml |
Project-specific prompts, MCP servers |
| Database | refio_poc.db (project root) |
Session data, messages, RAG index |
| Ignore patterns | <project>/.aiignore |
RAG indexing exclusions |
# ~/.refio/config.yaml
general:
  formatMarkdown: true
  streamingEnabled: true
providers:
  ollama:
    endpoint: "http://localhost:11434"
  anthropic:
    apiKey: "sk-ant-..."
models:
  defaults:
    chat: "ollama/qwen2.5:7b"
    coding: "ollama/qwen2.5-coder:7b"
    embedding: "ollama/nomic-embed-text"
  visibility:
    "ollama/qwen2.5:7b": true
    "anthropic/claude-3-5-sonnet-20241022": true

See docs/config.md for the full configuration reference.
| Layer | Protection |
|---|---|
| PathSandbox | All file ops restricted to project root |
| FileLimits | Size limits (2MB), excluded directories (24), extensions (34) |
| CommandDenylist | 58 dangerous patterns blocked (rm -rf, sudo, curl \| sh) |
| ToolPermissions | Per-mode access control (PLAN=read-only, AGENT=read-write) |
| No-Egress Mode | Blocks cloud providers, allows only Ollama/LM Studio |
| Secret Redaction | API keys masked in all logs |
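The secret-redaction layer can be sketched as a regex pass over log lines. The patterns below are assumptions about what counts as a secret (sk- prefixed keys, key=value assignments), not Refio's actual rules.

```kotlin
// Illustrative log redaction; the two patterns are hypothetical examples,
// not Refio's real secret-detection rules.
private val KEY_LITERAL = Regex("""sk-[A-Za-z0-9-]{8,}""")                 // sk-... style keys
private val KEY_ASSIGNMENT = Regex("""(?i)(api[_-]?key\s*[=:]\s*)(\S+)""") // apiKey: ... / api_key=...

fun redact(line: String): String =
    KEY_ASSIGNMENT.replace(KEY_LITERAL.replace(line, "***")) { m ->
        m.groupValues[1] + "***"
    }
```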
| Issue | Description | Mitigation |
|---|---|---|
| Symlink Escape | PathSandbox can be bypassed via symlinks | Detection + logging in place |
| Denylist Bypass | Terminal command patterns may be circumvented | Tool DISABLED by default |
Version: 0.0.1 (Full Kotlin plugin + embedded core)
CLI: Planned for v1.1+
Test Coverage: 0% (migration in progress)
v0.1 (Current)
- Complete architecture cleanup
- Fix P0 security issues
- Add Kotlin tests
- RAG file watcher
v0.2 (Planned)
- CLI parity (Ktor wrapper)
- Export/import flows
- Better context preview
- Cost controls
cd agent/plugin
# Development
./gradlew runIde # Run in sandbox IDE
./gradlew buildPlugin # Build ZIP distribution
# Quality
./gradlew detekt # Static analysis
./gradlew ktlintCheck # Lint check
./gradlew ktlintFormat # Auto-format
# Testing
./gradlew test               # Run all tests

src/main/kotlin/pl/jclab/refio/
├── core/ # Embedded core (no IDE dependencies)
│ ├── api/ # Router layer (9 domain routers)
│ ├── context/ # Context providers + MCP
│ ├── db/ # Database layer (Exposed ORM)
│ ├── llm/ # LLM integration (6 adapters)
│ ├── services/ # Core services (RAG, context, analysis)
│ ├── subagents/ # Subagent system
│ ├── tools/ # Tool system (10 implementations)
│ └── prompts/ # Prompt templates
├── services/ # Plugin services (project-scoped)
│ ├── session/ # SessionManager (6 components)
│ └── rag/ # Background indexing
└── ui/ # IntelliJ UI components
├── toolwindow/ # Tool window factory
├── components/ # Chat, toolbar, autocomplete
└── settings/ # 12+ settings panels
MIT License. See LICENSE.