A self-hosted localhost web application for browsing 4chan boards with local caching, greentext detection, image previews, and reply hover functionality.

user7210unix/Local-4chan-Scraper

🗂️ Local 4chan Scraper

A fast, efficient, and stable 4chan browser with smart caching.

Features

  • Smart Caching: Only thumbnails are cached by default; full images are loaded on demand
  • No Bloat: The temporary cache cleans itself up, so nothing accumulates permanently
  • Thread Filtering: Hide threads by keyword, per board
  • History Tracking: Recently viewed threads are listed in the sidebar
  • Optional Downloads: Enable the download button in Settings if needed
  • Fast & Stable: Rate-limited API requests, retry logic, LRU cache cleanup

🖼️ Preview

Screenshots (images not reproduced here):

  • Thread (Light Mode)
  • All Boards
  • Board Catalog
  • Filter Interface
  • Settings
  • Sidebar

Installation

  1. Clone the repository
  2. Install dependencies:
     pip install -r requirements.txt
  3. Run the application:
     python3 app.py
  4. Open your browser to http://127.0.0.1:5000

Project Structure

4chan-scraper/
├── app.py                  # Main Flask application
├── requirements.txt        # Python dependencies
├── .env                    # Environment variables (optional)
├── utils/
│   ├── config.py          # Configuration management
│   ├── database.py        # SQLite cache manager
│   ├── cache_manager.py   # Image cache manager
│   ├── api_client.py      # 4chan API client
│   ├── settings_manager.py # User settings
│   ├── history_manager.py  # Browsing history
│   └── filter_manager.py   # Thread filters
├── templates/
│   └── index.html         # Frontend UI
└── data/
    ├── cache/             # Temporary image cache (auto-cleanup)
    │   ├── thumbs/        # Thumbnail cache
    │   └── temp/          # On-demand full images
    ├── downloads/         # User downloads (optional)
    ├── chan.db            # SQLite database
    ├── settings.json      # User settings
    ├── history.json       # Browsing history
    └── filters.json       # Thread filters

Configuration

Optional environment variables in .env:

CACHE_TIME=10          # Cache TTL in minutes
MAX_CACHE_SIZE=500     # Max cache size in MB
HOST=127.0.0.1
PORT=5000
DEBUG=False
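
As a minimal sketch of how these variables could be read, the snippet below looks them up in the process environment and falls back to the values listed above. The names `Config` and `load_config` are hypothetical and are not taken from the project's utils/config.py; loading the .env file into the environment is assumed to happen elsewhere.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    cache_time_minutes: int
    max_cache_size_mb: int
    host: str
    port: int
    debug: bool


def load_config(env=None) -> Config:
    """Build a Config from an environment mapping, using the documented defaults."""
    env = os.environ if env is None else env
    return Config(
        cache_time_minutes=int(env.get("CACHE_TIME", "10")),
        max_cache_size_mb=int(env.get("MAX_CACHE_SIZE", "500")),
        host=env.get("HOST", "127.0.0.1"),
        port=int(env.get("PORT", "5000")),
        # Treat common truthy spellings as True; anything else is False.
        debug=env.get("DEBUG", "False").strip().lower() in ("1", "true", "yes"),
    )
```

Passing a plain dict instead of `os.environ` makes the parsing easy to check in isolation, for example `load_config({"PORT": "8080"})`.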

How It Works

Smart Caching System

  1. Thumbnails: Cached automatically while browsing catalogs
  2. Full Images: Loaded on demand when clicked and stored in the temp cache
  3. LRU Cleanup: The least recently used files are removed when the cache limit is reached
  4. Auto Expiry: Cache files older than 24 hours are deleted automatically
  5. No Bloat: Unused files do not accumulate on disk
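
The expiry and LRU steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's cache_manager.py: the function name `clean_cache` is hypothetical, file age is judged by modification time, and "least recently used" is approximated by the file's access time, which some filesystems update only lazily.

```python
import time
from pathlib import Path

MAX_AGE_SECONDS = 24 * 60 * 60  # 24-hour expiry, as described above


def clean_cache(cache_dir, max_bytes, now=None):
    """Delete expired files, then evict least recently used files
    until the cache fits in max_bytes. Returns the removed paths."""
    now = time.time() if now is None else now
    removed = []
    survivors = []
    for path in Path(cache_dir).rglob("*"):
        if not path.is_file():
            continue
        stat = path.stat()
        if now - stat.st_mtime > MAX_AGE_SECONDS:
            path.unlink()  # expired: older than 24 hours
            removed.append(path)
        else:
            survivors.append((stat.st_atime, stat.st_size, path))

    total = sum(size for _, size, _ in survivors)
    survivors.sort(key=lambda entry: entry[0])  # oldest access time first
    for _, size, path in survivors:
        if total <= max_bytes:
            break
        path.unlink()  # evict the least recently used file
        removed.append(path)
        total -= size
    return removed
```

With MAX_CACHE_SIZE given in MB, a caller would pass something like `max_bytes=500 * 1024 * 1024`. A more robust variant might record access times in the SQLite database rather than rely on the filesystem.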

Todo

  • Improve the wide UI layout (images are still centered)
  • Add favorite boards to the sidebar (currently hardcoded for testing purposes)

Download vs Cache

  • Cache: Temporary, auto-managed, for browsing only
  • Downloads: Permanent, user-initiated, stored separately (optional)

Usage

Browse Boards

  • Click any board from the grid to view its catalog

Thread Filtering

  1. Open a board catalog
  2. Click "Filters" in the sidebar
  3. Add keywords; threads containing any of them are hidden
  4. Filters are per-board and persistent
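
The filtering described in the steps above might look like the sketch below. The function names and the shape of the filter data (a dict mapping a board name to a list of keywords) are assumptions for illustration; `sub` and `com` are the subject and comment fields of a thread in the 4chan catalog JSON. Real comments contain HTML, so matching on the raw text is a simplification.

```python
def thread_is_hidden(thread, keywords):
    """Return True if any keyword appears in the thread's subject
    or comment (case-insensitive). Blank keywords are ignored."""
    text = f"{thread.get('sub', '')} {thread.get('com', '')}".lower()
    return any(kw.strip().lower() in text for kw in keywords if kw.strip())


def apply_filters(board, threads, filters):
    """Drop threads that match the keyword list stored for this board."""
    keywords = filters.get(board, [])
    return [t for t in threads if not thread_is_hidden(t, keywords)]
```

Because the keyword list is looked up by board, a filter added on one board has no effect on any other.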

Settings

  • Theme: Light or dark mode
  • Download Button: Enable to show download buttons on images
  • Show Sticky: Toggle sticky thread visibility
  • Image Hover: Enable hover previews (planned)

History

  • Recently viewed threads appear in the sidebar
  • Click an entry to quickly return to that thread
  • Clear all history with the "Clear" button
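
One way to model this behavior is a bounded, de-duplicated list of recent threads. The class below is a hypothetical sketch, not the project's history_manager.py; the limit of 20 entries and the entry fields are assumptions.

```python
from collections import OrderedDict


class RecentThreads:
    """Keep the most recently viewed threads, newest first, without duplicates."""

    def __init__(self, limit=20):
        self.limit = limit
        self._items = OrderedDict()  # (board, thread_no) -> title

    def visit(self, board, thread_no, title):
        key = (board, thread_no)
        self._items.pop(key, None)  # re-visiting moves the entry to the newest slot
        self._items[key] = title
        while len(self._items) > self.limit:
            self._items.popitem(last=False)  # drop the oldest entry

    def clear(self):
        self._items.clear()

    def as_list(self):
        """Return entries newest first, ready to render in a sidebar."""
        return [
            {"board": b, "no": n, "title": t}
            for (b, n), t in reversed(self._items.items())
        ]
```

Persisting `as_list()` to a JSON file such as data/history.json would be a natural next step, but is left out here.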

Performance

  • Rate limiting (1 request per second to the 4chan API)
  • Automatic retry on failed requests
  • Threaded background thumbnail downloads
  • Efficient SQLite caching for API responses
  • LRU cleanup prevents disk bloat
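
The rate limiting and retry behavior listed above could be sketched as follows. The class and function names, the retry count, and the linear backoff are assumptions for illustration and are not taken from the project's api_client.py; `fetch` stands for any caller-supplied function that performs the actual HTTP request.

```python
import threading
import time


class RateLimiter:
    """Allow at most one call per `interval` seconds across threads."""

    def __init__(self, interval=1.0, clock=time.monotonic, sleep=time.sleep):
        self.interval = interval
        self._clock = clock
        self._sleep = sleep
        self._lock = threading.Lock()
        self._next_allowed = 0.0

    def wait(self):
        # Holding the lock while sleeping serializes callers, so requests
        # leave one at a time, spaced by at least `interval`.
        with self._lock:
            now = self._clock()
            delay = self._next_allowed - now
            if delay > 0:
                self._sleep(delay)
                now += delay
            self._next_allowed = now + self.interval


def fetch_with_retry(fetch, limiter, retries=3, backoff=2.0, sleep=time.sleep):
    """Call fetch() under the rate limit, retrying on exceptions
    with a pause that grows after each failed attempt."""
    for attempt in range(retries):
        limiter.wait()
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff * (attempt + 1))
```

The `clock` and `sleep` parameters exist only so the timing logic can be exercised without real delays; in normal use the defaults apply.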

Troubleshooting

Images not loading?

  • Check your internet connection
  • The 4chan CDN might be slow - wait a few seconds and try again

Cache too large?

  • Reduce MAX_CACHE_SIZE in .env
  • Click "Settings" → "Clear Cache"

API errors?

  • Rate limiting may be active - wait a few seconds
  • The 4chan API might be temporarily down

License

MIT License - Free to use and modify
