This is the project for the Machine Learning and Deep Learning course at Politecnico di Torino, 2022-2023. It uses a provided training dataset called San Francisco eXtra Small (SF_XS) and an evaluation dataset called Tokyo eXtra Small (go here to download both datasets), together with a highly scalable training method called CosPlace, which makes it possible to reach SOTA results with compact descriptors.
After downloading the SF_XS dataset, simply run
$ python3 train.py --dataset_folder path/to/sf-xs
The script automatically splits SF_XS into CosPlace Groups and saves the resulting object in the folder cache.
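For intuition, the grouping idea described in the CosPlace paper can be sketched as below. This is a minimal illustration and not the code in train.py: the function names, the layout of the image tuples, and the default values of `M`, `alpha`, `N` and `L` are assumptions made for the example.

```python
import math
from collections import defaultdict


def class_and_group(utm_east, utm_north, heading, M=10, alpha=30, N=5, L=2):
    """Return (class_id, group_id) for one image.

    M is the side of a square cell in meters and alpha the width of a
    heading bin in degrees. N and L control how far apart (in cells and
    heading bins) two classes of the same group are.
    All default values are assumptions for illustration.
    """
    e = math.floor(utm_east / M)     # cell index along the east axis
    n = math.floor(utm_north / M)    # cell index along the north axis
    h = math.floor(heading / alpha)  # heading bin index
    class_id = (e, n, h)
    # Classes that share these remainders belong to the same group,
    # so classes within a group are never adjacent to each other.
    group_id = (e % N, n % N, h % L)
    return class_id, group_id


def split_into_groups(images, **kwargs):
    """Map group_id -> class_id -> list of image paths.

    `images` is an iterable of (path, utm_east, utm_north, heading).
    """
    groups = defaultdict(lambda: defaultdict(list))
    for path, east, north, heading in images:
        class_id, group_id = class_and_group(east, north, heading, **kwargs)
        groups[group_id][class_id].append(path)
    return groups
```

Each group can then be treated as its own classification problem, with one class per cell and heading bin.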
By default, training uses a ResNet-18 backbone with 512-dimensional descriptors, which fits in less than 4GB of VRAM.
To change the backbone or the dimensionality of the output descriptors, simply run
$ python3 train.py --dataset_folder path/to/sf-xs --backbone efficientnet_v2_s --fc_output_dim 128
You can also speed up training with Automatic Mixed Precision (note that none of the results/statistics in the paper used AMP)
$ python3 train.py --dataset_folder path/to/sf-xs --use_amp16
Run $ python3 train.py -h to have a look at all the hyperparameters that you can change. You will find all hyperparameters mentioned in the paper.
You can also try some autoaugmentations by simply running
$ python3 train.py --dataset_folder path/to/sf-xs --autoaugment_policy IMAGENET
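If you want to launch several of the configurations above from a script, a small helper along these lines may be convenient. It is only a sketch: `build_train_command` is a hypothetical name, and it uses nothing beyond the flags shown in the commands above.

```python
import shlex


def build_train_command(dataset_folder, backbone=None, fc_output_dim=None,
                        use_amp16=False, autoaugment_policy=None):
    """Build the argument list for train.py (hypothetical helper).

    Only the flags shown in this README are used. Options left unset
    are omitted, so train.py falls back to its own defaults.
    """
    cmd = ["python3", "train.py", "--dataset_folder", dataset_folder]
    if backbone is not None:
        cmd += ["--backbone", backbone]
    if fc_output_dim is not None:
        cmd += ["--fc_output_dim", str(fc_output_dim)]
    if use_amp16:
        cmd.append("--use_amp16")
    if autoaugment_policy is not None:
        cmd += ["--autoaugment_policy", autoaugment_policy]
    return cmd


# Print the commands for two descriptor sizes. To actually run one,
# pass the list to subprocess.run(cmd, check=True).
for dim in (128, 512):
    cmd = build_train_command("path/to/sf-xs", fc_output_dim=dim, use_amp16=True)
    print(shlex.join(cmd))
```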
Results from the paper are fully reproducible, and we followed deep learning's best practices (average over multiple runs for the main results, validation/early stopping and hyperparameter search on the val set). If you are a researcher comparing your work against ours, please make sure to follow these best practices and avoid picking the best model on the test set.
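The selection protocol described above can be sketched in a few lines. This is an illustration only: the function names and the dictionary keys are assumptions, and the numbers are placeholders invented to exercise the code, not results from this project or from the paper.

```python
from statistics import mean, stdev


def select_by_validation(checkpoints):
    """Pick the checkpoint with the best validation recall.

    The test recall is never looked at during selection.
    """
    return max(checkpoints, key=lambda c: c["val_recall"])


def summarize_runs(runs):
    """Return (mean, std) of the test recall over independent runs.

    For each run, the reported test recall is the one of the checkpoint
    selected on the validation set.
    """
    selected = [select_by_validation(run)["test_recall"] for run in runs]
    spread = stdev(selected) if len(selected) > 1 else 0.0
    return mean(selected), spread


# Placeholder values, invented purely for illustration (not real results).
runs = [
    [{"val_recall": 60.0, "test_recall": 50.0},
     {"val_recall": 62.0, "test_recall": 49.0}],
    [{"val_recall": 61.0, "test_recall": 52.0},
     {"val_recall": 59.0, "test_recall": 54.0}],
]
avg, spread = summarize_runs(runs)
print(f"test recall: {avg:.1f} +/- {spread:.1f}")
```

Note that in both placeholder runs the checkpoint with the highest test recall is not the one selected, which is exactly the point: picking it would overstate the result.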
You can test a trained model as follows
$ python3 eval.py --dataset_folder path/to/sf-xl/processed --backbone VGG16 --fc_output_dim 128 --resume_model path/to/best_model.pth
You can download plenty of trained models below.
All our trained models are now available on PyTorch Hub, so you can use them in any codebase without cloning this repository, like this
import torch
model = torch.hub.load("gmberton/cosplace", "get_trained_model", backbone="ResNet50", fc_output_dim=2048)
As an alternative, you can download the trained models from the table below, which provides links to models with different backbones and dimensionality of descriptors, trained on SF-XL.
| Model / Dimension of Descriptors | 32 | 64 | 128 | 256 | 512 | 1024 | 2048 |
|---|---|---|---|---|---|---|---|
| ResNet-18 | link | link | link | link | link | - | - |
| ResNet-50 | link | link | link | link | link | link | link |
| ResNet-101 | link | link | link | link | link | link | link |
| ResNet-152 | link | link | link | link | link | link | link |
| VGG-16 | - | link | link | link | link | - | - |
Or you can download all models at once at this link
Parts of this repo are inspired by the following repositories:
- CosFace implementation in PyTorch
- CNN Image Retrieval in PyTorch (for the GeM layer)
- Visual Geo-localization benchmark (for the evaluation / test code)