Echomine

A library-first tool for parsing AI conversation exports, with search, filtering, and Markdown export

Overview

Echomine is a Python library and CLI tool for parsing and searching AI conversation exports and converting them to other formats. Built on a multi-provider adapter pattern, it currently supports OpenAI ChatGPT and Anthropic Claude exports and can be extended to other AI platforms (Gemini, etc.).

Key Features

Memory Efficient: Stream-based parsing handles 1GB+ files with constant memory usage
Advanced Search: BM25 relevance ranking with exact phrase matching, boolean logic, role filtering, and keyword exclusion
Message Snippets: Automatic preview generation for search results with match context
Statistics & Analytics: Calculate export statistics, conversation metrics, and temporal patterns
Rich CLI Output: Color-coded terminal formatting, tables, progress bars, and syntax highlighting
Multiple Export Formats: Export to Markdown (with YAML frontmatter), JSON, or CSV
Type Safe: Strict typing with Pydantic v2 and mypy --strict compliance
Library First: All CLI capabilities are available as an importable Python library
Multi-Provider Support: OpenAI ChatGPT and Anthropic Claude exports with auto-detection
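
To give a sense of what BM25 relevance ranking computes, here is a minimal, self-contained sketch of the standard Okapi BM25 formula. It is not Echomine's implementation: the function name `bm25_scores`, the whitespace tokenizer, and the defaults `k1=1.5` and `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score each document against the query with Okapi BM25 (illustrative only)."""
    # Hypothetical tokenizer: lowercase and split on whitespace.
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs if n_docs else 0.0

    # Document frequency: number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in (t.lower() for t in query_terms):
            tf = term_freq.get(term, 0)
            if tf == 0:
                continue
            # Smoothed IDF variant that never goes negative.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Length normalization: long documents need more hits to score equally.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that contain none of the query terms score 0.0, and repeated occurrences of a term raise the score with diminishing returns.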

Design Principles

Library-First Architecture: CLI built on top of library, not vice versa
Strict Type Safety: mypy --strict, no Any types in public API
Memory Efficiency: Stream-based parsing, never load entire file into memory
Test-Driven Development: Every feature is written and validated test-first
YAGNI: Simple solutions, no speculative features

See Constitution for complete design principles.
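
As a rough illustration of the library-first layering, the sketch below keeps all logic in a plain function and makes the CLI a thin wrapper around it. The names `find_titles` and `main` are hypothetical, and `argparse` is used only to keep the example dependency-free (Echomine's own CLI is built with Typer).

```python
import argparse
import sys


def find_titles(titles, contains):
    """Library layer: returns data lazily; never prints or parses arguments."""
    needle = contains.lower()
    return (t for t in titles if needle in t.lower())


def main(argv=None, out=sys.stdout):
    """CLI layer: parses arguments, calls the library, formats the output."""
    parser = argparse.ArgumentParser(prog="titles")
    parser.add_argument("--contains", default="", help="substring to look for")
    parser.add_argument("titles", nargs="*", help="titles to filter")
    args = parser.parse_args(argv)
    for title in find_titles(args.titles, args.contains):
        print(title, file=out)
    return 0
```

Because the CLI only translates arguments and prints results, anything it can do is also reachable by importing `find_titles` directly, which is the property the first principle asks for.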

Installation

From Source

# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine

# Install with development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks (optional)
pre-commit install

From PyPI (when published)

pip install echomine

Quick Start

Library API (Primary Interface)

from echomine import OpenAIAdapter, ClaudeAdapter, SearchQuery
from pathlib import Path

# Initialize adapter for your provider (stateless, reusable)
adapter = OpenAIAdapter()  # For ChatGPT exports
# adapter = ClaudeAdapter()  # For Claude exports
export_file = Path("conversations.json")

# 1. List all conversations (discovery)
for conversation in adapter.stream_conversations(export_file):
    print(f"[{conversation.created_at.date()}] {conversation.title}")
    print(f"  Messages: {len(conversation.messages)}")

# 2. Search with keywords (BM25 ranking)
query = SearchQuery(keywords=["algorithm", "design"], limit=10)
for result in adapter.search(export_file, query):
    print(f"{result.conversation.title} (score: {result.score:.2f})")
    print(f"  Preview: {result.snippet}")  # v1.1.0: automatic snippets

# 3. Advanced search with filters (v1.1.0+)
from datetime import date
query = SearchQuery(
    keywords=["refactor"],
    phrases=["algo-insights"],  # Exact phrase matching
    match_mode="all",  # Require ALL keywords (AND logic)
    exclude_keywords=["test"],  # Filter out unwanted results
    role_filter="user",  # Search only user messages
    from_date=date(2024, 1, 1),
    to_date=date(2024, 3, 31),
    limit=5
)
for result in adapter.search(export_file, query):
    print(f"[{result.score:.2f}] {result.conversation.title}")
    print(f"  Snippet: {result.snippet}")

# 4. Calculate statistics (v1.2.0+)
from echomine import calculate_statistics

stats = calculate_statistics(export_file)
print(f"Total conversations: {stats.total_conversations}")
print(f"Total messages: {stats.total_messages}")
print(f"Average messages: {stats.average_messages:.1f}")

# 5. Get specific conversation by ID
conversation = adapter.get_conversation_by_id(export_file, "conv-abc123")
if conversation:
    print(f"Found: {conversation.title}")

CLI Usage (Built on Library)

# Auto-detect provider (default - works for both OpenAI and Claude)
echomine list export.json

# Explicit provider selection (v1.3.0+)
echomine list export.json --provider claude
echomine list export.json --provider openai

# Search by keywords
echomine search export.json --keywords "algorithm,design" --limit 10

# Search by exact phrase (v1.1.0+)
echomine search export.json --phrase "algo-insights"

# Boolean match mode: require ALL keywords (v1.1.0+)
echomine search export.json -k "python" -k "async" --match-mode all

# Exclude unwanted results (v1.1.0+)
echomine search export.json -k "python" --exclude "django" --exclude "flask"

# Role filtering: search only user/assistant messages (v1.1.0+)
echomine search export.json -k "refactor" --role user

# Combine all filters (v1.1.0+)
echomine search export.json --phrase "api" -k "python" --exclude "test" --role user --match-mode all

# Search by title (fast, metadata-only)
echomine search export.json --title "Project"

# Filter by date range
echomine search export.json --from-date "2024-01-01" --to-date "2024-03-31"

# View export statistics (v1.2.0+)
echomine stats export.json

# Get conversation by ID (v1.2.0+)
echomine get export.json conv-abc123

# Export conversation to markdown with YAML frontmatter (v1.2.0+)
echomine export export.json conv-abc123 --output algo.md

# Export as JSON
echomine export export.json conv-abc123 --format json --output algo.json

# Export as CSV (v1.2.0+)
echomine export export.json conv-abc123 --format csv --output algo.csv

# JSON output for search results
echomine search export.json --keywords "python" --json | jq '.results[].title'

# Version info
echomine --version

Search Filter Logic: Content matching (phrases OR keywords) happens first, then post-filtering (--exclude, --role, --title, dates) is applied. See CLI Usage for details.
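
The two-stage order described above can be sketched as follows. This is a simplified illustration, not Echomine's code: the `Message` dataclass, the `filter_messages` function, and the substring matching are hypothetical stand-ins, and the title filter, date filters, and `--match-mode all` are left out.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def filter_messages(messages, keywords=(), phrases=(), exclude=(), role=None):
    """Apply content matching first, then post-filters (illustrative only)."""
    # Stage 1: content matching. A message is a candidate if it contains
    # any phrase OR any keyword (case-insensitive substring match here).
    needles = [n.lower() for n in (*phrases, *keywords)]
    candidates = [
        m for m in messages
        if not needles or any(n in m.text.lower() for n in needles)
    ]
    # Stage 2: post-filtering runs only on the stage 1 candidates.
    excluded = [e.lower() for e in exclude]
    return [
        m for m in candidates
        if (role is None or m.role == role)
        and not any(e in m.text.lower() for e in excluded)
    ]
```

Because exclusion and role checks run after content matching, they can only narrow the result set; they never bring back a message that stage 1 rejected.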

See Quickstart Guide for detailed examples.

Development

Prerequisites

Python 3.12 or higher
Git

Setup Development Environment

# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine

# Install with development dependencies
pip install -e ".[dev]"

# Install pre-commit hooks
pre-commit install

Running Tests

# Run all tests
pytest

# Run with coverage
pytest --cov=echomine --cov-report=html

# Run specific test categories
pytest -m unit           # Unit tests only
pytest -m integration    # Integration tests only
pytest -m contract       # Contract tests only
pytest -m performance    # Performance benchmarks

Code Quality

# Type checking (strict mode)
mypy src/

# Linting and formatting
ruff check .
ruff format .

# Run pre-commit hooks manually
pre-commit run --all-files

Project Structure

echomine/
├── src/echomine/           # Library source code
│   ├── models/             # Pydantic data models
│   ├── adapters/           # Provider adapters (OpenAI, Claude)
│   ├── parsers/            # Streaming JSON parsers
│   ├── search/             # Search and ranking logic
│   ├── exporters/          # Export formatters (Markdown, JSON, CSV)
│   └── cli/                # CLI commands
├── tests/                  # Test suite
│   ├── unit/               # Unit tests
│   ├── integration/        # Integration tests
│   ├── contract/           # Protocol contract tests
│   └── performance/        # Performance benchmarks
└── specs/                  # Design documents
    └── 001-ai-chat-parser/ # Feature specification

Documentation

Full Documentation - Comprehensive guides, API reference, and examples

Quick Links

Spec Documents

Performance

Echomine is designed for memory efficiency and speed:

Memory: O(1) memory usage regardless of file size (streaming-based)
Search: <30 seconds for 1.6GB files (10K conversations, 50K messages)
Listing: <5 seconds for 10K conversations

See Performance Requirements for benchmarks.
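
To show why streaming keeps memory flat, here is a minimal sketch that yields the elements of a top-level JSON array one at a time while reading the file in fixed-size chunks. It uses only the standard library and is not how Echomine parses (the project lists ijson as its streaming parser); the name `stream_array_items` and the chunk size are assumptions.

```python
import json


def stream_array_items(fp, chunk_size=65536):
    """Yield elements of a top-level JSON array without loading the whole file."""
    decoder = json.JSONDecoder()
    buffer = fp.read(chunk_size).lstrip()
    if not buffer.startswith("["):
        raise ValueError("expected a top-level JSON array")
    buffer = buffer[1:]
    while True:
        # Skip separators between elements (lenient: extra commas are ignored).
        buffer = buffer.lstrip(" \t\r\n,")
        if buffer.startswith("]"):
            return
        try:
            item, end = decoder.raw_decode(buffer)
            # A trailing character proves the value was not cut off mid-chunk.
            complete = end < len(buffer)
        except json.JSONDecodeError:
            complete = False
        if complete:
            yield item
            buffer = buffer[end:]  # drop the consumed element from memory
            continue
        chunk = fp.read(chunk_size)
        if not chunk:
            raise ValueError("unexpected end of JSON array")
        buffer += chunk
```

Peak memory in this sketch is bounded by the chunk size plus the largest single element rather than by the file size, which is the property the O(1) figure above refers to. Re-decoding after every new chunk is wasteful for very large elements; an incremental parser avoids that cost.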

Contributing

Contributions are welcome! Please see CONTRIBUTING.md for:

Development setup and prerequisites
TDD workflow (RED-GREEN-REFACTOR cycle mandatory)
Testing guidelines (pytest, mypy --strict, ruff)
Code quality standards and conventions
Commit message format (conventional commits)
Pull request process

License

AGPL-3.0 License - See LICENSE file for details

Acknowledgments

Built with:

Pydantic - Data validation and type safety
ijson - Streaming JSON parser
Typer - CLI framework
Rich - Terminal formatting
structlog - Structured logging
