NavOCR is an open-source project that provides a lightweight text detection model for navigation.
Other publicly available OCR models often work too well and detect texts unrelated to navigation, such as advertisements, logos, or price tags.
NavOCR detects only the text that is necessary for navigation, such as signboards, directional guides, and room numbers.
We provide the full pipeline for model training (including data crawling, dataset preprocessing, and fine-tuning).
❗This repository is currently under heavy refactoring and development. Please note that it may contain unstable components. Improvements and updates will be released soon.
Our model is included in this repo. So, clone this repo to download the model!
The current model supports detection of store signboards only.
Detection of other navigation-relevant text types will be supported in future updates.
git clone git@github.com:kc-ml2/NavOCR.gitInstall PaddleDetection following offical guide.
# Setup python env
pip install gdown==5.2.0
# Download sample testset
mkdir data && cd data
gdown https://drive.google.com/uc?id=1GcgddRm4GsjPKUOVdmWFzeF5gElCZfx2
unzip example_sequence.zip
cd .. && mkdir results# Remove visualize argument for fast inference
python run_inference.py -c configs/ppyoloe/ppyoloe_crn_s_infer_only.yml --infer_dir data/example_sequence/images --visualize TrueComing soon! (Dataset crawling, dataset preprocessing, model fine-tuning, ...)
We're working on expanding support beyond store signboards detection model. Stay tuned for upcoming features for broader navigation use cases.
- Library migration due to a license issue (
ultralytics->PaddleDetection) - Alternative inference for higher FPS (
PaddleDetectionis slow for video inference. (About 30 FPS with GPU)) - Model training scripts
- Integration with text recognition (Only detection is available now.)
- Room number and floor sign detection
- Directional guide text detection
- Integration with other SLAM packages via ROS
This repository is licensed under the Apache License, Version 2.0.
This project includes code and configuration files derived from PaddleDetection (https://github.com/PaddlePaddle/PaddleDetection), which is also licensed under the Apache License, Version 2.0.
