A web-based evaluation platform for OCR (Optical Character Recognition) results, featuring a real-time leaderboard and TEDS (Tree Edit Distance based Similarity) metric scoring.
This platform allows users to:
- Upload OCR prediction results in JSON format
- Automatically evaluate predictions against ground truth using TEDS metrics
- View real-time rankings on an interactive leaderboard with WebSocket progress updates
- Analyze detailed evaluation results with filtering and statistics
- Export results as CSV for further analysis
- Switch between Traditional Chinese and English interfaces
- Compare table recognition accuracy with other participants
- Manage submissions and delete entries through a secure dashboard (administrators only)
- TEDS Metric: Industry-standard Tree Edit Distance based Similarity for table structure evaluation
- Flexible Input: Supports both Markdown and HTML table formats
- Real-time Leaderboard: Instant ranking updates after each submission
- Detailed Score View: View individual table scores with filtering and statistics
- WebSocket Progress: Real-time progress updates during evaluation
- Multi-language Support: Switch between Traditional Chinese and English
- Admin Dashboard: Manage submissions with authentication and delete capabilities
- Format Validation: Automatic validation of uploaded JSON files
- Modern UI: Clean and responsive web interface
- Docker Support: Easy deployment with containerization
- CSV Export: Download detailed scores as CSV files
- Backend: FastAPI
- Frontend: Jinja2 Templates, HTML/CSS/JavaScript
- Real-time Communication: WebSocket
- Internationalization: Custom i18n module (Chinese/English)
- Metrics: TEDS (Tree Edit Distance based Similarity), Levenshtein distance, edit distance
- Parsing: lxml, apted, distance, zss
- Server: Uvicorn
- Authentication: Cookie-based session management
- Python 3.12 or higher
- pip package manager
- Clone the repository:
git clone https://github.com/wcks13589/ocr-eval-platform.git
cd ocr-eval-platform
- Install dependencies:
pip install -r requirements.txt
- Prepare your ground truth data:
  - Place your ground truth JSON file at data/ground_truth.json
  - Format: {"id": "<table>...</table>"} or Markdown table format
- Run the server:
uvicorn app.main:app --host 0.0.0.0 --port 8080
- Access the platform:
  - Open your browser and navigate to http://localhost:8080
- Build the Docker image:
docker build -t ocr-eval-platform .
- Run the container with volume mounting:
docker run -p 8080:8080 \
-v $(pwd)/data:/app/data \
ocr-eval-platform
Note: The -v flag mounts the local data/ directory to persist uploads and leaderboard data.
- Access the platform at http://localhost:8080
For easier management, use Docker Compose:
# Start the service
docker compose up -d
# View logs
docker compose logs -f
# Stop the service
docker compose down
The docker-compose.yml automatically handles volume mounting and configuration.
Your prediction file should be a JSON file with the following structure:
{
"sample_id_1": "| Header1 | Header2 |\n|---------|----------|\n| Cell1 | Cell2 |",
"sample_id_2": "<table><tr><td>Cell1</td><td>Cell2</td></tr></table>",
"sample_id_3": "..."
}
Supported formats:
- Markdown tables
- HTML table strings
- Mixed format (different IDs can use different formats)
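Before uploading, it can help to sanity-check a prediction file locally. The following is a minimal sketch, assuming only the structure described above (a JSON object mapping sample IDs to table strings). The helper names `detect_format` and `check_predictions` are hypothetical and not part of the platform, and the HTML/Markdown detection is a simple heuristic rather than the platform's own parser.

```python
import json


def detect_format(table: str) -> str:
    """Guess whether a table string is HTML or Markdown (heuristic, hypothetical helper)."""
    return "html" if "<table" in table.lower() else "markdown"


def check_predictions(predictions: dict) -> dict:
    """Return {sample_id: format}, or raise ValueError on a malformed entry."""
    formats = {}
    for sample_id, table in predictions.items():
        if not isinstance(sample_id, str) or not isinstance(table, str):
            raise ValueError(f"{sample_id!r}: IDs and tables must both be strings")
        formats[sample_id] = detect_format(table)
    return formats


predictions = {
    "sample_id_1": "| Header1 | Header2 |\n|---------|----------|\n| Cell1 | Cell2 |",
    "sample_id_2": "<table><tr><td>Cell1</td><td>Cell2</td></tr></table>",
}
formats = check_predictions(predictions)

# Serialize as JSON text; ensure_ascii=False keeps non-ASCII cell text readable.
payload = json.dumps(predictions, ensure_ascii=False, indent=2)
```

Writing `payload` to a file with UTF-8 encoding yields a file in the mixed format listed above.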
- Navigate to the main page
- Enter your participant name
- Upload your JSON prediction file
- Click "Start Evaluation"
- Watch real-time progress updates via WebSocket
- View your score and ranking on the leaderboard
- Click "Details" to see individual table scores
After submission, you can view detailed evaluation results:
- Click the "Details" button next to your name on the leaderboard
- View statistics including:
  - Overall TEDS score
  - Valid data count
  - Score distribution (Perfect/High/Medium/Low)
  - Individual table scores
- Use filters to show/hide:
  - Normal data
  - Missing data
  - Error data
  - Score range filtering
- Download results as CSV for further analysis
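The same kind of filtering can be reproduced offline once results have been downloaded. This is a rough sketch only: it assumes each detail row carries `id`, `status`, and `score` fields, and those field names, the status values, and the helpers `filter_rows` and `to_csv` are illustrative assumptions rather than the platform's actual schema.

```python
import csv
import io


def filter_rows(rows, statuses=("normal",), min_score=0.0, max_score=1.0):
    """Keep rows whose status is selected and whose score lies in [min_score, max_score]."""
    kept = []
    for row in rows:
        if row["status"] not in statuses:
            continue
        score = row["score"]
        # Rows without a score (missing/error) pass whenever their status is selected.
        if score is not None and not (min_score <= score <= max_score):
            continue
        kept.append(row)
    return kept


def to_csv(rows):
    """Render rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", "status", "score"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up example rows, not real evaluation output.
rows = [
    {"id": "table_001", "status": "normal", "score": 0.95},
    {"id": "table_002", "status": "normal", "score": 0.40},
    {"id": "table_003", "status": "missing", "score": None},
]
high = filter_rows(rows, min_score=0.9)
csv_text = to_csv(high)
```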
Administrators can manage submissions:
- Navigate to /admin/login
- Enter the admin password
- Access the admin dashboard to:
  - View all submissions
  - Delete individual entries (removes all associated data)
  - Monitor platform usage
- Logout when finished to clear the session
The platform uses TEDS (Tree Edit Distance based Similarity) to evaluate table structure accuracy:
- Range: 0.0 to 1.0 (higher is better)
- Calculation: Measures structural and content similarity between predicted and ground truth tables
- Normalization: Accounts for table size differences
- Weighting: Considers both cell content and table structure
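The normalization step can be illustrated with the TEDS definition from PubTabNet, TEDS = 1 - EditDist(T_pred, T_true) / max(|T_pred|, |T_true|), where |T| is the number of nodes in a table tree. The sketch below shows only that final step; the tree edit distance itself would come from a library such as apted. The function name `teds_from_distance` is hypothetical, and the handling of two empty trees is an assumption.

```python
def teds_from_distance(edit_distance: float, n_nodes_pred: int, n_nodes_true: int) -> float:
    """Normalize a tree edit distance into a similarity score in [0, 1]."""
    largest = max(n_nodes_pred, n_nodes_true)
    if largest == 0:
        return 1.0  # Assumption: two empty trees are treated as identical.
    return 1.0 - edit_distance / largest


# Identical trees: no edits needed, so the score is 1.0.
perfect = teds_from_distance(0, 10, 10)

# 2.5 units of edit cost against a larger tree of 10 nodes.
partial = teds_from_distance(2.5, 10, 8)
```

Dividing by the larger tree is what keeps scores comparable across small and large tables.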
Main page with upload form and leaderboard
Upload prediction file without evaluation
- Parameters:
  - name (form field): Participant name
  - file (file upload): JSON prediction file
- Returns: JSON response with file path or error
Upload and evaluate prediction file (fallback for non-WebSocket)
- Parameters:
  - name (form field): Participant name
  - file (file upload): JSON prediction file
- Returns: Updated leaderboard with evaluation results
View standalone leaderboard page
View detailed evaluation results for a participant
- Parameters:
  - name (path): Participant name
- Returns: HTML page with detailed scores, statistics, and filtering options
Get detailed evaluation data in JSON format
- Parameters:
  - name (path): Participant name
- Returns: JSON with detailed scores and statistics
Set interface language preference
- Parameters:
  - lang (path): Language code (zh-TW or en)
- Returns: Redirect to previous page with language cookie set
Real-time evaluation progress updates
- Parameters:
  - session_id (path): Unique session identifier
- Messages:
  - Receives: {name, file_path} to start evaluation
  - Sends: Progress updates and completion status
Admin login page
Admin authentication
- Parameters:
  - password (form field): Admin password
- Returns: Redirect to dashboard on success
Admin control panel (requires authentication)
- Features: View all submissions, delete entries
- Authentication: Cookie-based session token
Admin logout and session cleanup
Delete a participant's data (requires admin authentication)
- Parameters:
  - name (path): Participant name
  - admin_token (cookie): Admin session token
- Returns: JSON response with updated leaderboard
ocr-eval-platform/
├── app/
│   ├── main.py                  # FastAPI application and routes
│   ├── evaluation.py            # Evaluation logic and metrics
│   ├── TEDS_metric.py           # TEDS implementation
│   ├── parallel.py              # Parallel processing utilities
│   ├── i18n.py                  # Internationalization (Chinese/English)
│   ├── static/
│   │   └── style.css            # Styling
│   └── templates/
│       ├── index.html           # Main page with upload form
│       ├── leaderboard.html     # Standalone leaderboard page
│       ├── details.html         # Detailed score view page
│       ├── admin_login.html     # Admin login page
│       ├── admin_dashboard.html # Admin control panel
│       └── result.html          # Results display (legacy)
├── data/                        # Data directory (separate from code)
│   ├── ground_truth.json        # Ground truth data
│   ├── leaderboard.json         # Leaderboard storage (auto-generated)
│   ├── details/                 # Individual participant detailed scores
│   └── uploads/                 # Uploaded prediction files
├── .gitignore                   # Git ignore rules
├── Dockerfile                   # Docker configuration
├── docker-compose.yml           # Docker Compose configuration
├── requirements.txt             # Python dependencies
└── README.md                    # This file
The platform uses a dedicated data/ directory to separate data from code:
Benefits:
- Clean separation: Code and data are isolated
- Easy backup: Simply back up the data/ folder
- Docker persistence: Easy volume mounting for containers
- Version control: Data files can be gitignored separately
- Security: Sensitive data is isolated from application code
Directory structure:
data/
├── ground_truth.json    # Your test dataset (required)
├── leaderboard.json     # Auto-generated rankings
├── details/             # Detailed scores for each participant
└── uploads/             # User-submitted predictions
The data/ground_truth.json file should contain:
{
"sample_id_1": "| Header1 | Header2 |\n|---------|----------|\n| Cell1 | Cell2 |",
"sample_id_2": "<table><tr><td>Cell1</td><td>Cell2</td></tr></table>",
"sample_id_3": "..."
}
Example with actual data:
{
"table_001": "| Name | Age | City |\n|------|-----|------|\n| Alice | 30 | NYC |",
"table_002": "<table><tr><td>Product</td><td>Price</td></tr><tr><td>Apple</td><td>$2</td></tr></table>"
}
Set the admin password using an environment variable:
# Linux/Mac
export ADMIN_PASSWORD="your_secure_password"
# Windows
set ADMIN_PASSWORD=your_secure_password
# Docker
docker run -p 8080:8080 -e ADMIN_PASSWORD=your_secure_password ocr-eval-platform
Default password (if not set): admin123
Security Note: Always change the default admin password in production environments.
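How an application might read this setting can be sketched as follows. This is an illustration only, not the platform's actual code: the helper names are hypothetical, and the constant-time comparison via `hmac.compare_digest` is a suggested practice rather than something the source confirms. Only the `ADMIN_PASSWORD` variable and the `admin123` default come from the documentation above.

```python
import hmac
import os

DEFAULT_PASSWORD = "admin123"  # Documented default; override it in production.


def get_admin_password(env=os.environ) -> str:
    """Read ADMIN_PASSWORD from the environment, falling back to the default."""
    return env.get("ADMIN_PASSWORD", DEFAULT_PASSWORD)


def password_matches(candidate: str, env=os.environ) -> bool:
    """Compare in constant time to avoid leaking information through timing."""
    expected = get_admin_password(env)
    return hmac.compare_digest(candidate.encode("utf-8"), expected.encode("utf-8"))


def uses_default_password(env=os.environ) -> bool:
    """True when no custom password is configured (worth a startup warning)."""
    return get_admin_password(env) == DEFAULT_PASSWORD
```

Passing `env` explicitly makes the helpers easy to exercise without touching the real process environment.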
The platform supports:
- Traditional Chinese (zh-TW), the default
- English (en)
Users can switch languages using the language selector in the web interface. The preference is stored in a cookie for 1 year.
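The language selection rule can be sketched as below, assuming the cookie simply stores the language code. The constant and function names are hypothetical; only the two language codes, the zh-TW default, and the one-year lifetime come from the description above.

```python
SUPPORTED_LANGUAGES = ("zh-TW", "en")
DEFAULT_LANGUAGE = "zh-TW"
COOKIE_MAX_AGE = 365 * 24 * 60 * 60  # One year, in seconds


def resolve_language(cookie_value):
    """Return the stored language if supported, otherwise the default."""
    if cookie_value in SUPPORTED_LANGUAGES:
        return cookie_value
    return DEFAULT_LANGUAGE
```

Falling back to the default for unknown or missing values keeps a stale or tampered cookie from breaking page rendering.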
In app/evaluation.py, you can adjust:
teds = TEDS(n_jobs=4)  # Number of parallel jobs
The platform handles various error cases:
- Invalid JSON format: Returns error message with parsing details and removes uploaded file
- Encoding errors: Detects non-UTF-8 files and provides helpful error messages
- Duplicate names: Prevents overwriting existing submissions with clear warning
- Missing fields: Gracefully handles incomplete predictions
- WebSocket fallback: Automatically falls back to traditional POST if WebSocket is unavailable
- Authentication errors: Redirects to login page for unauthorized admin access
- File cleanup: Automatically removes uploaded files on evaluation failure
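Several of these checks can be mirrored locally before uploading. The sketch below is an approximation under stated assumptions, not the platform's implementation: `parse_predictions` and `missing_ids` are hypothetical helpers, and the error messages are illustrative.

```python
import json


def parse_predictions(raw: bytes) -> dict:
    """Decode and parse an uploaded file, raising ValueError with a readable reason."""
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError(f"File is not valid UTF-8 (byte offset {exc.start})") from exc
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(
            f"Invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"
        ) from exc
    if not isinstance(data, dict):
        raise ValueError("Top-level JSON value must be an object mapping IDs to tables")
    return data


def missing_ids(predictions: dict, ground_truth: dict) -> list:
    """List ground truth IDs that have no prediction, in sorted order."""
    return sorted(set(ground_truth) - set(predictions))
```

For example, `missing_ids({"a": "..."}, {"a": "...", "b": "..."})` reports `["b"]`, which corresponds to the incomplete-prediction case listed above.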
Copyright 2020 IBM (TEDS implementation)
Licensed under Apache 2.0 License
Contributions are welcome! Please feel free to submit issues or pull requests.
For questions or issues, please contact the project maintainer or open an issue on GitHub.
- TEDS implementation based on PubTabNet by IBM Research
- Tree edit distance using apted library
- FastAPI framework for rapid web development
Version: 2.0.0
Last Updated: October 2025
- Multi-language support (Traditional Chinese and English)
- Admin dashboard with authentication
- Detailed score view with filtering and statistics
- WebSocket real-time progress updates
- CSV export functionality
- Admin delete capabilities
- Improved UI with better user experience
- Cookie-based session management