Library-first tool for parsing AI conversation exports with search, filtering, and markdown export
Echomine is a Python library and CLI tool for parsing, searching, and exporting AI conversation exports. Built with a multi-provider adapter pattern, it currently supports OpenAI ChatGPT and Anthropic Claude exports, with extensibility for future AI platforms (Gemini, etc.).
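As a rough illustration of the multi-provider adapter pattern, the sketch below picks an adapter by inspecting the first conversation in an export. The names `ExportAdapter`, `ChatGPTStyleAdapter`, `ClaudeStyleAdapter`, and `detect_adapter` are hypothetical and are not echomine's API. The field names used for detection (`mapping`, `chat_messages`, `name`) are assumptions about the two export formats.

```python
from typing import Any, Iterator, Protocol

Conversation = dict[str, Any]


class ExportAdapter(Protocol):
    """Hypothetical adapter interface (not echomine's real protocol)."""

    def iter_titles(self, conversations: list[Conversation]) -> Iterator[str]: ...


class ChatGPTStyleAdapter:
    # Assumption: ChatGPT-style conversations store the title under "title".
    def iter_titles(self, conversations: list[Conversation]) -> Iterator[str]:
        for conv in conversations:
            yield conv.get("title") or "(untitled)"


class ClaudeStyleAdapter:
    # Assumption: Claude-style conversations store the title under "name".
    def iter_titles(self, conversations: list[Conversation]) -> Iterator[str]:
        for conv in conversations:
            yield conv.get("name") or "(untitled)"


def detect_adapter(conversations: list[Conversation]) -> ExportAdapter:
    """Choose an adapter from the keys of the first conversation."""
    first = conversations[0] if conversations else {}
    if "chat_messages" in first:
        return ClaudeStyleAdapter()
    if "mapping" in first:
        return ChatGPTStyleAdapter()
    raise ValueError("unrecognized export format")
```

Because callers only depend on the shared interface, supporting another provider would mean adding one more adapter class and one more detection rule.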
- Memory Efficient: Stream-based parsing handles 1GB+ files with constant memory usage
- Advanced Search: BM25 relevance ranking with exact phrase matching, boolean logic, role filtering, and keyword exclusion
- Message Snippets: Automatic preview generation for search results with match context
- Statistics & Analytics: Calculate export statistics, conversation metrics, and temporal patterns
- Rich CLI Output: Color-coded terminal formatting, tables, progress bars, and syntax highlighting
- Multiple Export Formats: Export to Markdown (with YAML frontmatter), JSON, or CSV
- Type Safe: Strict typing with Pydantic v2 and mypy --strict compliance
- Library First: All CLI capabilities available as importable Python library
- Multi-Provider Support: OpenAI ChatGPT and Anthropic Claude exports with auto-detection
- Library-First Architecture: CLI built on top of library, not vice versa
- Strict Type Safety: mypy --strict, no Any types in public API
- Memory Efficiency: Stream-based parsing, never load entire file into memory
- Test-Driven Development: All features test-first validated
- YAGNI: Simple solutions, no speculative features
See Constitution for complete design principles.
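For readers unfamiliar with the BM25 relevance ranking named in the feature list, here is a minimal, self-contained sketch of Okapi BM25 scoring over pre-tokenized documents. It is not echomine's implementation: the function name `bm25_scores`, the parameter defaults (`k1=1.5`, `b=0.75`), and the non-negative IDF variant are assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query_terms, documents, k1=1.5, b=0.75):
    """Score each tokenized document against query_terms with Okapi BM25."""
    n_docs = len(documents)
    if n_docs == 0:
        return []
    avg_len = sum(len(doc) for doc in documents) / n_docs
    # Document frequency: how many documents contain each term at least once.
    doc_freq = Counter(term for doc in documents for term in set(doc))

    scores = []
    for doc in documents:
        term_freq = Counter(doc)
        score = 0.0
        for term in query_terms:
            tf = term_freq[term]
            if tf == 0:
                continue  # term absent: contributes nothing
            df = doc_freq[term]
            # IDF variant with "+1" inside the log so it never goes negative.
            idf = math.log(1 + (n_docs - df + 0.5) / (df + 0.5))
            # Length normalization: long documents need more hits to score high.
            norm = tf + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores
```

Documents that repeat a query term score higher, with diminishing returns controlled by `k1`, while `b` controls how strongly long documents are penalized.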
# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine
# Install with development dependencies
pip install -e ".[dev]"
# Install pre-commit hooks (optional)
pre-commit install
pip install echomine
from echomine import OpenAIAdapter, ClaudeAdapter, SearchQuery
from pathlib import Path
# Initialize adapter for your provider (stateless, reusable)
adapter = OpenAIAdapter() # For ChatGPT exports
# adapter = ClaudeAdapter() # For Claude exports
export_file = Path("conversations.json")
# 1. List all conversations (discovery)
for conversation in adapter.stream_conversations(export_file):
    print(f"[{conversation.created_at.date()}] {conversation.title}")
    print(f" Messages: {len(conversation.messages)}")
# 2. Search with keywords (BM25 ranking)
query = SearchQuery(keywords=["algorithm", "design"], limit=10)
for result in adapter.search(export_file, query):
    print(f"{result.conversation.title} (score: {result.score:.2f})")
    print(f" Preview: {result.snippet}") # v1.1.0: automatic snippets
# 3. Advanced search with filters (v1.1.0+)
from datetime import date
query = SearchQuery(
    keywords=["refactor"],
    phrases=["algo-insights"], # Exact phrase matching
    match_mode="all", # Require ALL keywords (AND logic)
    exclude_keywords=["test"], # Filter out unwanted results
    role_filter="user", # Search only user messages
    from_date=date(2024, 1, 1),
    to_date=date(2024, 3, 31),
    limit=5,
)
for result in adapter.search(export_file, query):
    print(f"[{result.score:.2f}] {result.conversation.title}")
    print(f" Snippet: {result.snippet}")
# 4. Calculate statistics (v1.2.0+)
from echomine import calculate_statistics
stats = calculate_statistics(export_file)
print(f"Total conversations: {stats.total_conversations}")
print(f"Total messages: {stats.total_messages}")
print(f"Average messages: {stats.average_messages:.1f}")
# 5. Get specific conversation by ID
conversation = adapter.get_conversation_by_id(export_file, "conv-abc123")
if conversation:
    print(f"Found: {conversation.title}")
# Auto-detect provider (default - works for both OpenAI and Claude)
echomine list export.json
# Explicit provider selection (v1.3.0+)
echomine list export.json --provider claude
echomine list export.json --provider openai
# Search by keywords
echomine search export.json --keywords "algorithm,design" --limit 10
# Search by exact phrase (v1.1.0+)
echomine search export.json --phrase "algo-insights"
# Boolean match mode: require ALL keywords (v1.1.0+)
echomine search export.json -k "python" -k "async" --match-mode all
# Exclude unwanted results (v1.1.0+)
echomine search export.json -k "python" --exclude "django" --exclude "flask"
# Role filtering: search only user/assistant messages (v1.1.0+)
echomine search export.json -k "refactor" --role user
# Combine all filters (v1.1.0+)
echomine search export.json --phrase "api" -k "python" --exclude "test" --role user --match-mode all
# Search by title (fast, metadata-only)
echomine search export.json --title "Project"
# Filter by date range
echomine search export.json --from-date "2024-01-01" --to-date "2024-03-31"
# View export statistics (v1.2.0+)
echomine stats export.json
# Get conversation by ID (v1.2.0+)
echomine get export.json conv-abc123
# Export conversation to markdown with YAML frontmatter (v1.2.0+)
echomine export export.json conv-abc123 --output algo.md
# Export as JSON
echomine export export.json conv-abc123 --format json --output algo.json
# Export as CSV (v1.2.0+)
echomine export export.json conv-abc123 --format csv --output algo.csv
# JSON output for search results
echomine search export.json --keywords "python" --json | jq '.results[].title'
# Version info
echomine --version
Search Filter Logic: Content matching (phrases OR keywords) happens first, then post-filtering (--exclude, --role, --title, dates) is applied. See CLI Usage for details.
See Quickstart Guide for detailed examples.
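The two-stage filter order described above (content matching first, post-filtering second) can be sketched as follows. This is an illustrative model only: `Conv`, `content_match`, `post_filter`, and `search` are hypothetical names, matching is simplified to whitespace tokens and substrings, and role filtering is left out because its exact semantics are not spelled out here. Treating a query with no keywords or phrases as matching everything in stage one is also an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Conv:
    """Hypothetical stand-in for a conversation (not echomine's model)."""
    title: str
    created: date
    text: str


def content_match(text, keywords=(), phrases=(), match_mode="any"):
    """Stage 1: a phrase hit OR a keyword hit selects the conversation."""
    if not keywords and not phrases:
        return True  # assumption: no content terms means nothing to exclude yet
    lowered = text.lower()
    tokens = set(lowered.split())
    phrase_hit = any(p.lower() in lowered for p in phrases)
    hits = [k.lower() in tokens for k in keywords]
    combine = all if match_mode == "all" else any
    keyword_hit = bool(hits) and combine(hits)
    return phrase_hit or keyword_hit


def post_filter(conv, exclude=(), title=None, from_date=None, to_date=None):
    """Stage 2: drop content matches that fail any remaining filter."""
    tokens = set(conv.text.lower().split())
    if any(term.lower() in tokens for term in exclude):
        return False
    if title is not None and title.lower() not in conv.title.lower():
        return False
    if from_date is not None and conv.created < from_date:
        return False
    if to_date is not None and conv.created > to_date:
        return False
    return True


def search(convs, keywords=(), phrases=(), match_mode="any", **filters):
    for conv in convs:
        if content_match(conv.text, keywords, phrases, match_mode) and post_filter(conv, **filters):
            yield conv
```

Under this ordering, a conversation that matches a phrase is still dropped if it contains an excluded term or falls outside the date range.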
- Python 3.12 or higher
- Git
# Clone repository
git clone https://github.com/echomine/echomine.git
cd echomine
# Install with development dependencies
pip install -e ".[dev]"
# Install pre-commit hooks
pre-commit install
# Run all tests
pytest
# Run with coverage
pytest --cov=echomine --cov-report=html
# Run specific test categories
pytest -m unit # Unit tests only
pytest -m integration # Integration tests only
pytest -m contract # Contract tests only
pytest -m performance # Performance benchmarks
# Type checking (strict mode)
mypy src/
# Linting and formatting
ruff check .
ruff format .
# Run pre-commit hooks manually
pre-commit run --all-files
echomine/
├── src/echomine/           # Library source code
│   ├── models/             # Pydantic data models
│   ├── adapters/           # Provider adapters (OpenAI, etc.)
│   ├── parsers/            # Streaming JSON parsers
│   ├── search/             # Search and ranking logic
│   ├── exporters/          # Export formatters (markdown, JSON)
│   └── cli/                # CLI commands
├── tests/                  # Test suite
│   ├── unit/               # Unit tests
│   ├── integration/        # Integration tests
│   ├── contract/           # Protocol contract tests
│   └── performance/        # Performance benchmarks
└── specs/                  # Design documents
    └── 001-ai-chat-parser/ # Feature specification
Full Documentation - Comprehensive guides, API reference, and examples
Echomine is designed for memory efficiency and speed:
- Memory: O(1) memory usage regardless of file size (streaming-based)
- Search: <30 seconds for 1.6GB files (10K conversations, 50K messages)
- Listing: <5 seconds for 10K conversations
See Performance Requirements for benchmarks.
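To illustrate how streaming keeps memory independent of file size, the sketch below yields the items of a top-level JSON array one at a time using only the standard library. It is not echomine's parser: `stream_array_items` is a hypothetical helper, it assumes the export is a single top-level array, and its memory use is bounded by the largest single item plus one chunk rather than by the whole file. It is also lenient about stray commas and re-scans incomplete items, so treat it as a teaching aid rather than a production parser.

```python
import io
import json


def stream_array_items(fp, chunk_size=65536):
    """Yield items of a top-level JSON array without loading the whole file."""
    decoder = json.JSONDecoder()
    buffer = ""
    started = False
    eof = False
    while True:
        chunk = fp.read(chunk_size)
        if chunk:
            buffer += chunk
        else:
            eof = True
        # Consume as many complete items as the buffer currently holds.
        while True:
            buffer = buffer.lstrip()
            if not buffer:
                break
            if not started:
                if buffer[0] != "[":
                    raise ValueError("expected a top-level JSON array")
                buffer, started = buffer[1:], True
            elif buffer[0] == ",":
                buffer = buffer[1:]
            elif buffer[0] == "]":
                return
            else:
                try:
                    item, end = decoder.raw_decode(buffer)
                except json.JSONDecodeError:
                    break  # item is incomplete so far: read more data
                rest = buffer[end:].lstrip()
                if not rest or rest[0] not in ",]":
                    break  # cannot yet confirm the item ended (e.g. split number)
                yield item
                buffer = rest
        if eof:
            raise ValueError("unexpected end of JSON array")
```

A caller can iterate lazily, for example `for conv in stream_array_items(open("conversations.json", encoding="utf-8")): ...`, and stop early without reading the rest of the file.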
Contributions are welcome! Please see CONTRIBUTING.md for:
- Development setup and prerequisites
- TDD workflow (RED-GREEN-REFACTOR cycle mandatory)
- Testing guidelines (pytest, mypy --strict, ruff)
- Code quality standards and conventions
- Commit message format (conventional commits)
- Pull request process
AGPL-3.0 License - See LICENSE file for details
Built with: