NavOCR

Text Detection for Navigation!

NavOCR is an open-source project that provides a lightweight text detection model for navigation.
Other publicly available OCR models often detect too much for this purpose: they pick up text unrelated to navigation, such as advertisements, logos, and price tags. NavOCR is designed to detect only the text needed for navigation, such as signboards, directional guides, and room numbers.
The full pipeline for model training (data crawling, dataset preprocessing, and fine-tuning) will also be provided; see Training Model for its current status.

❗This repository is currently under heavy refactoring and development. Please note that it may contain unstable components. Improvements and updates will be released soon.

How to Use

Download Model

The model is included in this repository, so cloning the repository also downloads the model.
The current model supports detection of store signboards only. Detection of other navigation-relevant text types will be supported in future updates.

git clone git@github.com:kc-ml2/NavOCR.git

Prerequisite

Install PaddleDetection following the official guide.

Download Testset

# Set up the Python environment
pip install gdown==5.2.0

# Download sample testset
mkdir data && cd data
gdown https://drive.google.com/uc?id=1GcgddRm4GsjPKUOVdmWFzeF5gElCZfx2
unzip example_sequence.zip 
cd .. && mkdir results
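
Before running inference, it can help to confirm that the archive unpacked where the inference command expects it. The following is a minimal sketch, not part of the repository: the helper name `list_images` and the set of image suffixes are assumptions made for illustration, and the sample set may use other formats.

```python
from pathlib import Path

# Assumed image formats; adjust if the sample testset uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images(image_dir):
    """Return the sorted image files found directly inside image_dir."""
    root = Path(image_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"Image directory not found: {root}")
    return sorted(
        p for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

For example, `len(list_images("data/example_sequence/images"))` should be greater than zero once the download and unzip steps above have succeeded.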

Run NavOCR!

# Remove the --visualize argument for faster inference
python run_inference.py -c configs/ppyoloe/ppyoloe_crn_s_infer_only.yml --infer_dir data/example_sequence/images --visualize True
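
To switch between visualized and fast runs from a script, the command above can be assembled programmatically. This is a rough sketch that uses only the flags shown in the command (`-c`, `--infer_dir`, `--visualize`); the helper names `build_inference_command` and `run_inference` are hypothetical and do not exist in the repository.

```python
import subprocess
import sys


def build_inference_command(config, infer_dir, visualize=False):
    """Build the argument list for run_inference.py.

    The --visualize flag is appended only when requested, mirroring the
    advice above to drop it for faster inference.
    """
    cmd = [sys.executable, "run_inference.py", "-c", config, "--infer_dir", infer_dir]
    if visualize:
        cmd += ["--visualize", "True"]
    return cmd


def run_inference(config, infer_dir, visualize=False):
    """Run the inference script from the repository root; raise on failure."""
    return subprocess.run(
        build_inference_command(config, infer_dir, visualize), check=True
    )
```

Calling `run_inference("configs/ppyoloe/ppyoloe_crn_s_infer_only.yml", "data/example_sequence/images")` from the repository root would then run the fast variant without visualization.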

Training Model

Coming soon! (Dataset crawling, dataset preprocessing, model fine-tuning, ...)

🚧 Planned Updates

We're working on expanding the model beyond store signboard detection. Stay tuned for upcoming features that cover broader navigation use cases.

Library migration due to a license issue (ultralytics -> PaddleDetection)
Alternative inference for higher FPS (PaddleDetection is slow for video inference, at about 30 FPS with a GPU)
Model training scripts
Integration with text recognition (only detection is available now)
Room number and floor sign detection
Directional guide text detection
Integration with other SLAM packages via ROS
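
For comparing inference options against the FPS figure mentioned in the list above, throughput can be estimated with a small timing loop. This is a generic sketch, not the method used to obtain that figure: `measure_fps` is a hypothetical helper, and `process_frame` stands for any callable that runs detection on one frame.

```python
import time


def measure_fps(process_frame, frames, warmup=1):
    """Estimate frames per second of process_frame over the given frames.

    The first `warmup` frames are processed but not timed, so one-off
    start-up costs do not skew the result.
    """
    frames = list(frames)
    for frame in frames[:warmup]:
        process_frame(frame)
    timed = frames[warmup:]
    if not timed:
        raise ValueError("Need more frames than warmup iterations")
    start = time.perf_counter()
    for frame in timed:
        process_frame(frame)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading from the clock on trivially fast calls.
    return len(timed) / elapsed if elapsed > 0 else float("inf")
```

Results depend heavily on hardware, image resolution, and whether visualization is enabled, so numbers from different setups are not directly comparable.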

License

This repository is licensed under the Apache License, Version 2.0.

This project includes code and configuration files derived from PaddleDetection (https://github.com/PaddlePaddle/PaddleDetection), which is also licensed under the Apache License, Version 2.0.
