pix2struct-docvqa-base

Instructions

The following steps have been tested on Google Colab.

Create a `config.toml` file with the following content:

[MODEL]
ORGANIZATION = "google"
MODEL_NAME = "pix2struct-docvqa-base"
MODELS_DIR = "models"

Install the requirements (preferably in a virtual environment):
```
pip install -r requirements.txt
```
Download the HF model and convert it to ONNX with quantization:
```
python convert.py
```

Run inference

Available model types:

available_models = {
    "HF_MODEL": Pix2StructHF,
    "ONNX_MODEL": Pix2StructOnnxWithoutPast,
    "ONNX_MODEL_WITH_PAST": Pix2StructOnnxWithPast,
}

python inference.py \
    --m <MODEL_TYPE> \
    --i <PATH_TO_IMAGE_FILE> \
    --q <QUESTION> \
    --quantize [True/False (Default: False)]
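
The flags above could be parsed along the following lines. This is only a sketch of one possible command-line interface, not the actual contents of `inference.py`: the helper names `str_to_bool` and `build_parser` are made up for illustration, and the image path and question in the usage line are placeholders. It restricts `--m` to the three model type keys listed above and converts the `True`/`False` text given to `--quantize` explicitly, because `type=bool` in `argparse` treats any non-empty string (including `"False"`) as true.

```python
import argparse

# Keys of the available_models mapping listed above.
MODEL_TYPES = ("HF_MODEL", "ONNX_MODEL", "ONNX_MODEL_WITH_PAST")


def str_to_bool(value: str) -> bool:
    """Hypothetical helper: parse 'True'/'False' text into a real boolean."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected True or False, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    """Assumed CLI layout that mirrors the flags documented above."""
    parser = argparse.ArgumentParser(description="Document VQA inference (sketch)")
    parser.add_argument("--m", choices=MODEL_TYPES, required=True, help="model type")
    parser.add_argument("--i", required=True, help="path to the image file")
    parser.add_argument("--q", required=True, help="question about the image")
    parser.add_argument("--quantize", type=str_to_bool, default=False,
                        help="use the quantized model (default: False)")
    return parser


# Placeholder arguments, passed as a list so the sketch runs without a shell.
args = build_parser().parse_args(
    ["--m", "ONNX_MODEL", "--i", "data/sample.png",
     "--q", "What is the total?", "--quantize", "True"]
)
print(args.m, args.i, args.q, args.quantize)
```

Because `--m` uses `choices`, an unknown model type is rejected by the parser before any model is loaded.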

See the benchmarking results in results.md.
