A comprehensive Python-based tool for migrating complete MediaWiki sites to Wiki.js, preserving page organization, internal links, images, and categories. Designed for reliability with checkpoint/resume functionality and automatic authentication management.
- Complete Export: Extract all pages, images, and metadata from MediaWiki (v1.13.5+)
- Content Transformation: Convert wikitext to markdown with automatic link and image reference updates
- Intelligent Import: Create pages and upload images to Wiki.js via GraphQL API
- Resumable Operations: Checkpoint every 10 pages for large wiki migrations
- Flexible Authentication: Supports both public wikis (anonymous access) and private wikis (automatic reconnection on timeout)
- Dry-Run Mode: Preview operations without making changes
- Link Depth Control: Configurable BFS traversal for selective export
- Error Recovery: Comprehensive logging with CSV error reports
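The link depth control listed above can be pictured as a breadth-first traversal that stops following links past a given depth. The following is a minimal sketch, not the tool's actual implementation: `bfs_pages` and `get_links` are hypothetical names, and treating `-1` as "unlimited" is an assumption based on the default `MAX_LINK_DEPTH=-1` setting.

```python
from collections import deque
from typing import Callable, Dict, Iterable


def bfs_pages(
    start_titles: Iterable[str],
    get_links: Callable[[str], Iterable[str]],
    max_depth: int = -1,
) -> Dict[str, int]:
    """Return {title: depth} for every page reachable within max_depth links.

    get_links is a hypothetical callback returning the titles a page links to.
    A negative max_depth is assumed to mean "no limit".
    """
    start = list(start_titles)
    depths = {title: 0 for title in start}
    queue = deque(start)
    while queue:
        title = queue.popleft()
        depth = depths[title]
        # Do not follow links out of pages that already sit at the depth limit.
        if max_depth >= 0 and depth >= max_depth:
            continue
        for linked in get_links(title):
            if linked not in depths:  # visit each page only once
                depths[linked] = depth + 1
                queue.append(linked)
    return depths


# Usage with an in-memory link graph standing in for the MediaWiki API.
graph = {"Main": ["A", "B"], "A": ["C"], "B": [], "C": ["D"], "D": []}
selected = bfs_pages(["Main"], lambda t: graph.get(t, []), max_depth=2)
```

With `max_depth=2`, page `D` is left out because it is three links away from `Main`.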
- Python: 3.9 or higher (3.11+ recommended for performance)
- Pandoc: System-level installation required for wikitext→markdown conversion
- Network: HTTP/HTTPS access to both MediaWiki and Wiki.js instances
macOS (via Homebrew):
brew install pandoc
Linux (Debian/Ubuntu):
sudo apt-get install pandoc
Windows: Download from https://pandoc.org/installing.html
- Requirements
- Installation
- Quick Start
- Usage Examples
- Documentation
- Project Structure
- Troubleshooting
- Contributing
- License
- Clone the repository:
  git clone https://github.com/sbonaime/mediawiki2wikijs.git
  cd mediawiki2wikijs
- Create a virtual environment:
  python3 -m venv venv
  source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Configure settings:
  cp .env.example .env
  # Edit .env with your MediaWiki URL and Wiki.js credentials
  # Note: MediaWiki username/password are optional (leave empty for public wikis)
python src/export_mediawiki.py
This will:
- Connect to your MediaWiki instance
- Export all pages to the export_output/ directory
- Download all images
- Convert content to markdown
- Save checkpoint every 10 pages
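The checkpoint step above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the project's code: the JSON layout, the file name, and the helpers `load_done`, `save_done`, and `export_all` are all hypothetical. Only the "every 10 pages" frequency comes from this README.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, Set

CHECKPOINT_FREQUENCY = 10  # matches the documented default


def load_done(path: Path) -> Set[str]:
    """Read the set of already exported titles, or an empty set on first run."""
    if path.exists():
        return set(json.loads(path.read_text(encoding="utf-8"))["done"])
    return set()


def save_done(path: Path, done: Set[str]) -> None:
    """Write to a temporary file first so a crash cannot truncate the checkpoint."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps({"done": sorted(done)}), encoding="utf-8")
    tmp.replace(path)


def export_all(
    titles: Iterable[str],
    export_page: Callable[[str], object],
    checkpoint_path: Path,
    frequency: int = CHECKPOINT_FREQUENCY,
) -> Set[str]:
    """Export every title not yet recorded, saving progress every `frequency` pages."""
    done = load_done(checkpoint_path)
    since_save = 0
    for title in titles:
        if title in done:
            continue  # exported in a previous run
        export_page(title)  # hypothetical per-page export callback
        done.add(title)
        since_save += 1
        if since_save >= frequency:
            save_done(checkpoint_path, done)
            since_save = 0
    save_done(checkpoint_path, done)  # final save for the last partial batch
    return done
```

In this sketch, pages exported after the last checkpoint are exported again on resume, so an interrupted run repeats at most `frequency - 1` pages.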
python src/import_wikijs.py --source ./export_output
This will:
- Connect to your Wiki.js instance
- Create all pages with markdown content
- Upload all images
- Update internal links
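To illustrate what updating internal links can involve, here is a rough sketch that rewrites markdown links pointing at exported MediaWiki titles into Wiki.js-style paths. Everything in it is an assumption made for illustration: `title_to_path` and `rewrite_links` are hypothetical helpers, the path scheme (lowercase, hyphens, namespaces as folders) is invented, and the `"wikilink"` link title is what Pandoc typically emits for MediaWiki internal links. The tool's real mapping may differ.

```python
import re
from typing import Set


def title_to_path(title: str) -> str:
    """Hypothetical mapping from a MediaWiki title to a Wiki.js path."""
    # Namespace separators become path separators; whitespace becomes hyphens.
    parts = title.strip().split(":")
    slugs = [re.sub(r"[\s_]+", "-", part.strip()).lower() for part in parts]
    return "/" + "/".join(slug for slug in slugs if slug)


# Matches [text](target) with an optional "wikilink" title after the target.
LINK_RE = re.compile(r'\[([^\]]+)\]\(([^)\s]+)(?:\s+"wikilink")?\)')


def rewrite_links(markdown: str, known_titles: Set[str]) -> str:
    """Rewrite links whose target is a known page title; leave all others alone."""

    def replace(match: "re.Match[str]") -> str:
        text, target = match.group(1), match.group(2)
        title = target.replace("_", " ")
        if title in known_titles:
            return f"[{text}]({title_to_path(title)})"
        return match.group(0)  # external or unknown link: unchanged

    return LINK_RE.sub(replace, markdown)


converted = rewrite_links(
    'See [Main Page](Main_Page "wikilink") and [ext](https://example.com).',
    {"Main Page"},
)
```

Restricting the rewrite to known titles keeps external URLs and links to pages outside the export untouched.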
Edit your .env file with these settings:
# MediaWiki Configuration
MEDIAWIKI_URL=https://wiki.example.com
# Optional: For private wikis that require authentication
# Leave empty for public wikis
MEDIAWIKI_USERNAME=YourUsername
MEDIAWIKI_PASSWORD=YourPassword
# Wiki.js Configuration (required)
WIKIJS_URL=https://newwiki.example.com
WIKIJS_API_KEY=your-api-key-here
# Optional Settings
EXPORT_DIR=./export_output
CHECKPOINT_FREQUENCY=10
MAX_LINK_DEPTH=-1
LOG_LEVEL=INFO
Public vs Private Wikis:
- Public wikis: Leave MEDIAWIKI_USERNAME and MEDIAWIKI_PASSWORD empty or remove them. The tool will connect anonymously.
- Private wikis: Provide valid credentials. The tool will automatically handle authentication and reconnection on timeout.
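As a rough illustration of how these settings might be read and how the public/private distinction could be derived, consider the sketch below. The `Settings` class and `load_settings` function are hypothetical, not the contents of the project's config module. The sketch assumes the values are already present in the process environment (for example after a `.env` loader has run) and that the three URL/key settings are mandatory.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class Settings:
    mediawiki_url: str
    wikijs_url: str
    wikijs_api_key: str
    mediawiki_username: Optional[str] = None
    mediawiki_password: Optional[str] = None
    export_dir: str = "./export_output"
    checkpoint_frequency: int = 10
    max_link_depth: int = -1
    log_level: str = "INFO"

    @property
    def anonymous(self) -> bool:
        """True when no complete credential pair is set (public wiki)."""
        return not (self.mediawiki_username and self.mediawiki_password)


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment-style key/value pairs."""
    required = ("MEDIAWIKI_URL", "WIKIJS_URL", "WIKIJS_API_KEY")
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("Missing required settings: " + ", ".join(missing))
    return Settings(
        mediawiki_url=env["MEDIAWIKI_URL"],
        wikijs_url=env["WIKIJS_URL"],
        wikijs_api_key=env["WIKIJS_API_KEY"],
        # Empty strings are treated the same as unset values.
        mediawiki_username=env.get("MEDIAWIKI_USERNAME") or None,
        mediawiki_password=env.get("MEDIAWIKI_PASSWORD") or None,
        export_dir=env.get("EXPORT_DIR", "./export_output"),
        checkpoint_frequency=int(env.get("CHECKPOINT_FREQUENCY", "10")),
        max_link_depth=int(env.get("MAX_LINK_DEPTH", "-1")),
        log_level=env.get("LOG_LEVEL", "INFO"),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.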
# Dry run (preview without downloading)
python src/export_mediawiki.py --dry-run
# Limit link depth
python src/export_mediawiki.py --link-depth 2
# Resume interrupted export
python src/export_mediawiki.py --resume
# Export specific namespaces
python src/export_mediawiki.py --namespaces 0,2
# Verbose logging
python src/export_mediawiki.py --verbose
# Dry run (preview without creating pages)
python src/import_wikijs.py --source ./export_output --dry-run
# Skip existing pages
python src/import_wikijs.py --source ./export_output --skip-existing
# Force overwrite existing pages
python src/import_wikijs.py --source ./export_output --force
# Resume interrupted import
python src/import_wikijs.py --source ./export_output --resume
- Quickstart Guide: Detailed installation and usage instructions
- Feature Specification: Complete feature requirements
- Implementation Plan: Technical architecture and design
- API Contracts: MediaWiki and Wiki.js API documentation
src/
├── export_mediawiki.py          # Export CLI script
├── import_wikijs.py             # Import CLI script
├── lib/                         # Shared library code
│   ├── config.py                # Configuration management
│   ├── logger.py                # Logging utilities
│   ├── mediawiki_client.py      # MediaWiki API client
│   ├── wikijs_client.py         # Wiki.js GraphQL client
│   ├── content_transformer.py   # Markup conversion
│   ├── storage_manager.py       # File system operations
│   ├── image_processor.py       # Image handling
│   └── auth_manager.py          # Authentication management
└── models/                      # Data models
    ├── wiki_page.py             # Page entity
    ├── image_asset.py           # Image entity
    ├── link_reference.py        # Link relationship
    ├── checkpoint.py            # Checkpoint state
    └── migration_report.py      # Migration report
tests/
├── unit/                        # Unit tests
├── integration/                 # Integration tests
└── contract/                    # API contract tests
If you see "Authentication failed: Session expired", no action is needed: sessions expire after 5 minutes of inactivity, and the script reconnects automatically.
If Pandoc is not found: install Pandoc system-wide (see the Requirements section above).
If image downloads fail: check that your MediaWiki bot user has read permissions on the File namespace.
If a page already exists in Wiki.js: use the --skip-existing flag to skip existing pages, or --force to overwrite them.
Contributions are welcome! Here's how you can help:
- Fork the repository
- Create a feature branch: git checkout -b feature/amazing-feature
- Make your changes: Follow existing code style and add tests
- Commit: git commit -m 'Add amazing feature'
- Push: git push origin feature/amazing-feature
- Open a Pull Request
See tasks.md for planned features and open tasks.
# Install development dependencies
pip install -r requirements.txt
# Run tests
pytest tests/
# Run linting
ruff check src/
See CHANGELOG.md for a detailed history of changes and releases.
This project is licensed under the MIT License - see the LICENSE file for details.
For issues and questions:
- Review the Quickstart Guide
- Check the Troubleshooting section
- Create a GitHub issue with error logs and export_metadata.json
- MediaWiki API: Powered by mwclient
- Wiki.js GraphQL: Using gql
- Pandoc: Universal document converter by John MacFarlane
Project Status: ✅ MVP Complete (70% of planned features implemented)
- ✅ Phase 1-2: Setup and foundational infrastructure
- ✅ Phase 3: MediaWiki export with BFS traversal
- ✅ Phase 4: Content transformation (wikitext→markdown)
- ✅ Phase 5: Wiki.js import with GraphQL
- ⏳ Phase 6: Verification tools (planned)
- ⏳ Phase 7: Testing and polish (planned)
See tasks.md for detailed progress tracking.