Script for converting the dataset presented in the ReadingBank paper to a Hugging Face dataset

albertklor/reading-bank

Convert the ReadingBank dataset (from LayoutReader: Pre-training of Text and Layout for Reading Order Detection) into a Hugging Face Dataset with train/dev/test splits.

Virtual Environment

  • macOS/Linux:

    python -m venv .venv
    source .venv/bin/activate
    
  • Windows (PowerShell):

    python -m venv .venv
    .venv\Scripts\activate
    

Install

pip install -r requirements.txt

Run

python main.py --dataset_directory /path/to/ReadingBank --hub_repo your-username/readingbank --hf_token hf_your_token_here
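
The Run command passes three flags to main.py. The repository's actual argument handling is not shown here, so the following is only a minimal sketch of how such a command line could be parsed with the standard library's argparse. The function name `build_parser`, the help strings, and the choice to mark every flag as required are assumptions for illustration, not the script's confirmed behavior.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the three flags used in the Run command.

    Hypothetical helper: the real main.py may define its arguments differently.
    """
    parser = argparse.ArgumentParser(
        description="Convert ReadingBank into a Hugging Face dataset."
    )
    # Local directory holding the downloaded ReadingBank files.
    parser.add_argument(
        "--dataset_directory",
        required=True,  # assumption: the script cannot run without it
        help="Path to the local ReadingBank directory.",
    )
    # Target repository on the Hugging Face Hub, e.g. your-username/readingbank.
    parser.add_argument(
        "--hub_repo",
        required=True,  # assumption
        help="Hub repository to push the converted dataset to.",
    )
    # Access token used to authenticate the upload.
    parser.add_argument(
        "--hf_token",
        required=True,  # assumption
        help="Hugging Face access token with write permission.",
    )
    return parser
```

With this sketch, `build_parser().parse_args()` would read the same flags as the command above and expose them as `args.dataset_directory`, `args.hub_repo`, and `args.hf_token`. Passing a token on the command line can leave it in shell history, so reading it from an environment variable may be preferable where the script allows it.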
