Largefile MCP Server

Navigate, search, and edit large codebases, logs, and data files that exceed AI context limits.

Why Largefile?

Go beyond context limits - Read, search, and edit files too large to fit in AI context windows
Semantic code navigation - Tree-sitter extracts functions/classes for Python, JS/TS, Rust, Go
Fewer LLM errors - Search/replace editing avoids the line-number mistakes common with line-based edits
Smart search - Fuzzy matching, regex, case-insensitive, inverted, and count-only modes
No fixed size limit - Handles multi-GB files via a tiered memory strategy (RAM → mmap → streaming)
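
To make the fuzzy search mode concrete, here is a minimal sketch of threshold-based line matching. It is an illustration only: the function name `fuzzy_search` and the use of the standard-library `difflib` are assumptions, not Largefile's actual implementation. The default threshold of 0.8 and the cap of 20 results mirror the defaults listed under Configuration.

```python
import difflib


def fuzzy_search(lines, pattern, threshold=0.8, max_results=20):
    """Return (line_number, similarity, text) for lines resembling `pattern`.

    Hypothetical helper: difflib stands in for whatever matcher the
    real server uses.
    """
    matches = []
    needle = pattern.lower()
    for number, text in enumerate(lines, start=1):
        candidate = text.strip().lower()
        if needle in candidate:
            # An exact (case-insensitive) substring counts as a perfect match.
            score = 1.0
        else:
            # Otherwise compare the pattern against the whole line.
            score = difflib.SequenceMatcher(None, needle, candidate).ratio()
        if score >= threshold:
            matches.append((number, score, text.rstrip("\n")))
            if len(matches) >= max_results:
                break
    return matches
```

Because `lines` can be any iterable, an open file object can be passed directly, so the whole file never has to be held in memory. Comparing against whole lines is a simplification; short patterns inside long lines only match here through the substring shortcut.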

Quick Start

Prerequisite: Install uv, which provides the uvx command. Then add the server to your MCP client configuration:

{
  "mcpServers": {
    "largefile": {
      "command": "uvx",
      "args": ["--from", "largefile", "largefile-mcp"]
    }
  }
}

Tools

Tool	Use For
`get_overview`	File structure and semantic outline before diving in
`search_content`	Finding patterns, counting occurrences, regex matching
`read_content`	Reading specific sections; tail/head modes for logs
`edit_content`	Safe search/replace with automatic backups
`revert_edit`	Recovering from bad edits

When to Use Largefile

Use when:

File exceeds ~1000 lines or 100KB (supports multi-GB files)
Navigating large codebases with semantic structure
Analyzing log files (especially recent entries with tail mode)
Making search/replace edits across large files
Counting occurrences without loading full content

Don't use for:

Small files that fit in context (the model can read those directly)
Binary files (images, executables, compressed)

Usage Examples

Large Codebase Navigation

# Get semantic structure of a large Python file
overview = get_overview("/path/to/large_module.py")
# Returns: 2,847 lines, 15 classes, function outline via Tree-sitter

# Find all class definitions
classes = search_content("/path/to/large_module.py", "class ", fuzzy=False)

# Read complete class with semantic chunking
code = read_content("/path/to/large_module.py", pattern="class UserModel", mode="semantic")

Batch Refactoring

# Preview rename across file
preview = edit_content("/path/to/api.py", changes=[
    {"search": "process_data", "replace": "transform_data"},
    {"search": "old_endpoint", "replace": "new_endpoint"}
], preview=True)

# Apply changes (creates automatic backup); pass the same changes list used in the preview
result = edit_content("/path/to/api.py", changes=[...], preview=False)

# Undo if needed
revert_edit("/path/to/api.py")
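
The preview, apply, and revert cycle above can be pictured with a small stand-alone sketch. The names `apply_changes` and `revert`, and the `.bak` file next to the target, are hypothetical and chosen for illustration; the real server keeps its backups under `LARGEFILE_BACKUP_DIR` and may work quite differently internally.

```python
import difflib
import shutil
from pathlib import Path


def apply_changes(path, changes, preview=True, backup_suffix=".bak"):
    """Apply exact search/replace pairs and return a unified diff.

    Hypothetical helper: reads the whole file, so it only illustrates
    the workflow, not large-file handling.
    """
    target = Path(path)
    original = target.read_text(encoding="utf-8")
    updated = original
    for change in changes:
        if change["search"] not in updated:
            raise ValueError(f"search text not found: {change['search']!r}")
        updated = updated.replace(change["search"], change["replace"])
    diff = "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        updated.splitlines(keepends=True),
        fromfile=str(target),
        tofile=str(target),
    ))
    if not preview:
        # Copy the untouched file aside before writing, so the edit can be undone.
        shutil.copy2(target, target.with_name(target.name + backup_suffix))
        target.write_text(updated, encoding="utf-8")
    return diff


def revert(path, backup_suffix=".bak"):
    """Restore the file from the backup written by apply_changes."""
    target = Path(path)
    shutil.copy2(target.with_name(target.name + backup_suffix), target)
```

Matching on text rather than line numbers means an edit still lands in the right place if earlier edits shifted the line count, and a missing search string fails loudly instead of overwriting the wrong line.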

Log Analysis

# Get log file overview
overview = get_overview("/var/log/app.log")
# Returns: 150,000 lines, 2.1GB

# Read last 500 lines efficiently
recent = read_content("/var/log/app.log", limit=500, mode="tail")

# Count errors without loading content
error_count = search_content("/var/log/app.log", "ERROR", count_only=True, fuzzy=False)

# Find errors with regex
errors = search_content("/var/log/app.log", r"ERROR.*timeout", regex=True)
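
Tail mode is what keeps the last-500-lines read cheap on a multi-GB log. One common way to do this, shown here as a sketch rather than as Largefile's actual code, is to read fixed-size blocks backwards from the end of the file until enough newlines have been collected. The function name `tail_lines` and the 64 KB block size are assumptions.

```python
import os


def tail_lines(path, limit, block_size=64 * 1024):
    """Return the last `limit` lines without reading the whole file.

    Hypothetical helper illustrating a backwards block read.
    """
    if limit <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        data = b""
        # Prepend earlier blocks until more than `limit` newlines are
        # buffered (so the oldest wanted line is complete) or the start
        # of the file is reached.
        while position > 0 and data.count(b"\n") <= limit:
            read_size = min(block_size, position)
            position -= read_size
            f.seek(position)
            data = f.read(read_size) + data
    # Decode only after joining blocks, so multi-byte characters that
    # straddle a block boundary are not split.
    return [line.decode("utf-8", errors="replace")
            for line in data.splitlines()[-limit:]]
```

The amount of data read scales with the number of lines requested, not with the size of the file.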

Supported Languages

Tree-sitter semantic analysis for: Python, JavaScript/JSX, TypeScript/TSX, Rust, Go

Other file types use text-based analysis with graceful fallback.

File Size Handling

Size	Strategy
< 50MB	Full memory loading with AST caching
50-500MB	Memory-mapped access
> 500MB	Streaming (tail/head modes recommended)
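
The tiering in the table above amounts to a size check before the file is opened. The sketch below shows that decision in isolation; `choose_strategy` and the tier labels are illustrative names, and treating exactly 50MB and 500MB as memory-mapped is an assumption about the boundaries, which the table does not spell out.

```python
import os

# Defaults mirror the table; the constant names are illustrative.
MEMORY_THRESHOLD_MB = 50
MMAP_THRESHOLD_MB = 500


def choose_strategy(size_bytes,
                    memory_mb=MEMORY_THRESHOLD_MB,
                    mmap_mb=MMAP_THRESHOLD_MB):
    """Map a file size in bytes to one of the three access tiers."""
    size_mb = size_bytes / (1024 * 1024)
    if size_mb < memory_mb:
        return "memory"      # load fully into RAM
    if size_mb <= mmap_mb:
        return "mmap"        # memory-mapped access
    return "streaming"       # read sequentially in chunks


def strategy_for_path(path):
    """Pick a tier from the file's size on disk."""
    return choose_strategy(os.path.getsize(path))
```

For example, `choose_strategy(2 * 1024**3)` returns `"streaming"` for a 2GB log, while a 10MB source file falls in the `"memory"` tier.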

Configuration

Environment variables for tuning:

LARGEFILE_MEMORY_THRESHOLD_MB=50      # RAM loading limit
LARGEFILE_MMAP_THRESHOLD_MB=500       # Memory mapping limit
LARGEFILE_FUZZY_THRESHOLD=0.8         # Match sensitivity (0.0-1.0)
LARGEFILE_MAX_SEARCH_RESULTS=20       # Results per search
LARGEFILE_BACKUP_DIR=~/.largefile/backups
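
As a rough picture of how these variables could be consumed, the sketch below reads each one with the default shown above and validates the fuzzy threshold range. The `Settings` class and `load_settings` function are hypothetical names for illustration, not part of Largefile's API.

```python
import os
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Settings:
    memory_threshold_mb: int
    mmap_threshold_mb: int
    fuzzy_threshold: float
    max_search_results: int
    backup_dir: Path


def load_settings(env=None):
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ if env is None else env
    fuzzy = float(env.get("LARGEFILE_FUZZY_THRESHOLD", "0.8"))
    if not 0.0 <= fuzzy <= 1.0:
        raise ValueError("LARGEFILE_FUZZY_THRESHOLD must be between 0.0 and 1.0")
    backup = env.get("LARGEFILE_BACKUP_DIR", "~/.largefile/backups")
    return Settings(
        memory_threshold_mb=int(env.get("LARGEFILE_MEMORY_THRESHOLD_MB", "50")),
        mmap_threshold_mb=int(env.get("LARGEFILE_MMAP_THRESHOLD_MB", "500")),
        fuzzy_threshold=fuzzy,
        max_search_results=int(env.get("LARGEFILE_MAX_SEARCH_RESULTS", "20")),
        # Expand a leading "~" to the user's home directory.
        backup_dir=Path(os.path.expanduser(backup)),
    )
```

Passing a plain dict, as in `load_settings({"LARGEFILE_MAX_SEARCH_RESULTS": "5"})`, makes the overrides easy to try without touching the real environment.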

Documentation

API Reference - Detailed tool documentation
Configuration Guide - All environment variables
Examples - More workflow examples
Design Document - Architecture details
Contributing - Development setup
