Archive your Twitter/X data locally - likes, bookmarks, tweets, reposts, and home feed.
TweetHoarder uses cookie-based authentication to access Twitter's internal GraphQL API (no paid API key required), storing everything in a local SQLite database for offline access and search.
- Sync your data: Likes, bookmarks (with folders), tweets, reposts, replies, and home feed
- Quote tweets: Full support for quoted tweets in sync and all export formats
- Thread expansion: Archive full threads and conversations with context
- Rich HTML export: Twitter-style design with dark/light themes, virtual scrolling, search/filter, and copy-as-markdown
- Multiple exports: JSON, Markdown, CSV, and searchable HTML
- Resume support: Checkpointing allows interrupted syncs to continue
- Browser cookie extraction: Auto-detect from Firefox or Chrome
- Rate limit handling: Adaptive backoff prevents API bans
- Offline-first: All data stored locally in SQLite
# Clone the repository
git clone https://github.com/yourusername/tweethoarder.git
cd tweethoarder
# Install with uv
uv sync
# Or install as a package
uv pip install -e .
# First run - auto-detects cookies from Firefox/Chrome
tweethoarder sync likes
# Sync other collections
tweethoarder sync bookmarks
tweethoarder sync tweets --count 500
tweethoarder sync reposts --all
# Sync your home timeline (last 24 hours)
tweethoarder sync feed
# Export your data
tweethoarder export json --collection likes --output ~/likes.json
tweethoarder export html --collection bookmarks # Twitter-style HTML viewer
tweethoarder export html --collection all # Combined export with type filters
# View statistics
tweethoarder stats
Interrupted syncs automatically resume from the last checkpoint.
# Sync likes (default: 100 tweets, use --all for unlimited)
tweethoarder sync likes [--count N] [--all]
# Sync bookmarks from all folders
tweethoarder sync bookmarks [--count N] [--all]
# Sync your own tweets
tweethoarder sync tweets [--count N] [--all]
# Sync reposts (retweets)
tweethoarder sync reposts [--count N] [--all]
# Sync replies
tweethoarder sync replies [--count N] [--all]
# Sync tweets + reposts in a single efficient API call
tweethoarder sync posts [--count N] [--all]
# Sync home timeline (Following feed)
tweethoarder sync feed [--hours N] # Default: last 24 hours
# Sync with thread expansion (archives full threads for each tweet)
tweethoarder sync likes --with-threads [--thread-mode thread|conversation]
# Fetch thread for a specific tweet (author's chain only)
tweethoarder thread 1234567890
# Fetch full conversation including all replies
tweethoarder thread 1234567890 --mode conversation --limit 200
Thread vs Conversation:
- Thread: Same author's self-reply chain (classic "tweetstorm")
- Conversation: All tweets in the discussion, including other participants
# Export to JSON
tweethoarder export json [--collection TYPE] [--output PATH]
# Export to Markdown (thread-aware with quote tweet support)
tweethoarder export markdown [--collection TYPE] [--output PATH]
# Export to CSV
tweethoarder export csv [--collection TYPE] [--output PATH]
# Export specific bookmark folder
tweethoarder export json --collection bookmarks --folder "Work"
# Export to HTML (Twitter-style viewer)
tweethoarder export html [--collection TYPE] [--output PATH]
# Combined export with all collection types
tweethoarder export html --collection all
Collection types: likes, bookmarks, tweets, reposts, replies, posts, all
The HTML export generates a single-file Twitter-style viewer:
- Theme switcher: Toggle between light and dark modes
- Virtual scrolling: Smoothly handles thousands of tweets
- Search & filter: Full-text search with author filters
- Type filtering: When using --collection all, filter by likes/bookmarks/tweets/etc.
- Copy as Markdown: Click any tweet to copy it as formatted markdown
- Quote tweets: Visual display of quoted tweets with proper attribution
- Media support: Embedded images with expandable views
- Clickable links: @mentions link to Twitter profiles, URLs are clickable
# Show sync statistics
tweethoarder stats
# Force refresh Twitter API query IDs
tweethoarder refresh-ids
# View/modify configuration
tweethoarder config show
tweethoarder config set sync.default_tweet_count 500
TweetHoarder automatically extracts cookies from your browser. Priority order:
- Environment variables: TWITTER_AUTH_TOKEN, TWITTER_CT0, TWITTER_TWID
- Config file: ~/.config/tweethoarder/config.toml
- Firefox: Auto-detect from ~/.mozilla/firefox/*/cookies.sqlite (sketched below)
- Chrome: Auto-detect with keyring decryption
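For illustration, Firefox auto-detection amounts to reading auth_token and ct0 out of the profile's cookies.sqlite. A minimal standalone sketch of the idea (not TweetHoarder's internal API):

```python
import glob
import sqlite3
from pathlib import Path

def firefox_twitter_cookies() -> dict[str, str]:
    """Find auth_token/ct0 in the first Firefox profile that has them."""
    pattern = str(Path.home() / ".mozilla/firefox/*/cookies.sqlite")
    for db in glob.glob(pattern):
        # Open read-only so a running Firefox's lock isn't disturbed
        conn = sqlite3.connect(f"file:{db}?immutable=1", uri=True)
        rows = conn.execute(
            "SELECT name, value FROM moz_cookies "
            "WHERE host LIKE '%twitter.com' OR host LIKE '%x.com'"
        ).fetchall()
        conn.close()
        cookies = dict(rows)
        if "auth_token" in cookies and "ct0" in cookies:
            return {"auth_token": cookies["auth_token"], "ct0": cookies["ct0"]}
    raise RuntimeError("no logged-in Firefox profile found")
```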
If auto-detection fails, you can set cookies manually:
# Via environment variables
export TWITTER_AUTH_TOKEN="your_auth_token"
export TWITTER_CT0="your_ct0_token"
# Or in config file (~/.config/tweethoarder/config.toml)
[auth]
auth_token = "your_auth_token"
ct0 = "your_ct0_token"To find your cookies:
- Open Twitter/X in your browser
- Open Developer Tools (F12) > Application > Cookies
- Copy values for auth_token and ct0
All data is stored in SQLite at ~/.local/share/tweethoarder/tweethoarder.db
- tweets: All tweet content and metadata (including quote tweets)
- collections: Which tweets belong to which collection (likes, bookmarks, tweets, reposts, replies, feed)
- threads: Thread/conversation metadata
- sync_progress: Checkpoints for resumable syncs
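Resume works by persisting the last pagination cursor per collection in sync_progress. A rough sketch of the idea (the column names here are assumptions, not the actual schema):

```python
import sqlite3

def load_checkpoint(conn: sqlite3.Connection, collection: str) -> str | None:
    """Return the saved pagination cursor for a collection, if any."""
    row = conn.execute(
        "SELECT cursor FROM sync_progress WHERE collection = ?", (collection,)
    ).fetchone()
    return row[0] if row else None

def save_checkpoint(conn: sqlite3.Connection, collection: str, cursor: str) -> None:
    """Persist the cursor after each fetched page so an interrupted sync can resume."""
    conn.execute(  # assumes a UNIQUE constraint on collection
        "INSERT INTO sync_progress (collection, cursor) VALUES (?, ?) "
        "ON CONFLICT(collection) DO UPDATE SET cursor = excluded.cursor",
        (collection, cursor),
    )
    conn.commit()
```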
-- Find all liked tweets from a specific author
SELECT t.* FROM tweets t
JOIN collections c ON t.id = c.tweet_id
WHERE c.collection_type = 'like' AND t.author_username = 'elonmusk';
-- Get highly-engaged tweets in your likes
SELECT t.*, (t.like_count + t.retweet_count) as engagement
FROM tweets t
JOIN collections c ON t.id = c.tweet_id
WHERE c.collection_type = 'like'
ORDER BY engagement DESC LIMIT 50;
-- Get recent tweets from your home feed
SELECT t.* FROM tweets t
JOIN collections c ON t.id = c.tweet_id
WHERE c.collection_type = 'feed'
ORDER BY t.created_at DESC LIMIT 100;
See SPEC.md for comprehensive SQL query examples.
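Because the archive is an ordinary SQLite file, the same queries work from any language. For example, from Python (using only the columns shown above):

```python
import sqlite3
from pathlib import Path

conn = sqlite3.connect(Path.home() / ".local/share/tweethoarder/tweethoarder.db")
conn.row_factory = sqlite3.Row

# Ten most-retweeted tweets among your likes
rows = conn.execute(
    """
    SELECT t.author_username, t.retweet_count FROM tweets t
    JOIN collections c ON t.id = c.tweet_id
    WHERE c.collection_type = 'like'
    ORDER BY t.retweet_count DESC LIMIT 10
    """
).fetchall()
for row in rows:
    print(row["author_username"], row["retweet_count"])
```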
Config file location: ~/.config/tweethoarder/config.toml
[auth]
cookie_sources = ["firefox", "chrome"] # Priority order
[sync]
default_tweet_count = 100
request_delay_ms = 500
max_retries = 5
[export]
export_dir = "~/tweethoarder-exports"
[display]
show_progress = true
verbose = false
# Install dependencies
just setup
# Or manually:
uv sync --dev
uv run prek install
just test # Run tests
just lint # Check code quality
just format # Format code
just ci # Run full CI pipeline
tweethoarder/
├── src/tweethoarder/
│ ├── cli/ # Typer CLI commands
│ ├── client/ # Twitter API client
│ ├── auth/ # Cookie extraction
│ ├── query_ids/ # Query ID management
│ ├── storage/ # SQLite database
│ └── export/ # Export formatters
├── tests/ # Unit tests
├── SPEC.md # Detailed specification
└── CLAUDE.md # Development guidelines
TweetHoarder uses Twitter's internal GraphQL API (the same one the web app uses):
- Authentication: Extracts session cookies from your browser
- Query IDs: Discovers Twitter's rotating GraphQL operation IDs from JS bundles (see the sketch below)
- Fetching: Paginates through your likes/bookmarks/tweets with rate limiting
- Storage: Saves everything to SQLite with full metadata
- Export: Generates various output formats for offline viewing
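The query-ID discovery step can be pictured like this: the web app's JS bundles embed each GraphQL operation's ID next to its name, so a regex over the bundle recovers the mapping. A simplified sketch (the regex and bundle handling are assumptions about the bundle format, not TweetHoarder's code):

```python
import re
import requests

# Bundles contain fragments like: queryId:"abc123",operationName:"Likes"
QUERY_ID_RE = re.compile(r'queryId:"([\w-]+)",operationName:"(\w+)"')

def discover_query_ids(bundle_url: str) -> dict[str, str]:
    """Map GraphQL operation names (e.g. 'Likes') to their current query IDs."""
    js = requests.get(bundle_url, timeout=30).text
    return {name: qid for qid, name in QUERY_ID_RE.findall(js)}
```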
The architecture is ported from bird, a TypeScript Twitter client.
If cookie extraction fails:
- Make sure you're logged into Twitter in your browser
- Try closing the browser completely before running
- For Chrome, ensure GNOME Keyring or KDE Wallet is accessible
Twitter periodically rotates their API identifiers. Run:
tweethoarder refresh-ids
TweetHoarder uses adaptive rate limiting (sketched below). If you hit limits:
- Wait a few minutes and retry (syncs auto-resume from checkpoint)
- Increase request_delay_ms in config for slower but safer syncing
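The backoff behaves roughly like the sketch below: exponential delays driven by HTTP 429 responses, preferring the server's advertised reset time when present. This illustrates the approach; it is not TweetHoarder's actual code:

```python
import time
import requests

def get_with_backoff(url: str, headers: dict, max_retries: int = 5) -> requests.Response:
    """GET with exponential backoff on HTTP 429, honoring x-rate-limit-reset."""
    delay = 0.5  # mirrors request_delay_ms = 500 from the config
    for _ in range(max_retries):
        resp = requests.get(url, headers=headers, timeout=30)
        if resp.status_code != 429:
            return resp
        # x-rate-limit-reset is an epoch timestamp; fall back to our own delay
        reset = resp.headers.get("x-rate-limit-reset")
        wait = max(int(reset) - time.time(), delay) if reset else delay
        time.sleep(wait)
        delay *= 2  # back off harder on repeated rate limits
    raise RuntimeError(f"still rate limited after {max_retries} retries: {url}")
```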
MIT
- bird - TypeScript Twitter client (reference implementation)
- python-copier-template - Project template