Hydra-MDP: End-to-end Multimodal Planning with Multi-target Hydra-Distillation

This repo contains an implementation of the Hydra-MDP method (End-to-end Multimodal Planning with Multi-target Hydra-Distillation), built on top of the NAVSIM repo.

Running the model

1. Getting started:
Start by reading the Getting started section of the original NAVSIM README below, which gives a broader understanding of the code structure. The download and installation steps are required before proceeding. The original repo expects the directory structure shown below (example for the mini and trainval splits), so the downloaded files need to be rearranged after running the download scripts.
```
download/
├── sensor_blobs/
│   ├── trainval/
│   └── mini/
├── navsim_logs/
│   ├── trainval/
│   └── mini/
└── maps/
```
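To catch layout mistakes before launching the long preprocessing step, the expected tree can be checked with a few lines of Python. This is a minimal sketch: the helper name `missing_dirs` and the default root `download` are illustrative, and only the directories shown in the tree above are checked.

```python
from pathlib import Path

# Sub-directories taken from the tree above; adjust if your layout differs.
SPLIT_DIRS = ("sensor_blobs", "navsim_logs")


def missing_dirs(root, splits=("trainval", "mini")):
    """Return the expected directories that do not exist under `root`."""
    root = Path(root)
    expected = [root / "maps"]
    expected += [root / parent / split for parent in SPLIT_DIRS for split in splits]
    return [str(p.relative_to(root)) for p in expected if not p.is_dir()]


if __name__ == "__main__":
    for path in missing_dirs("download"):
        print(f"missing: {path}")
```

An empty result means every directory from the tree is present; it does not verify that the contents of those directories are complete.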
2. Dataset preparation:
To train the model, the dataset must first be processed to generate the preprocessed targets and labels. Our implementation pre-computes PDM metrics as targets, which takes about 30 hours on 32 cores. Because the task is computationally intensive, we recommend running it as a separate script that benefits from parallelization:
```
bash scripts/hydra_scripts/prepare_dataset_cache.sh
```
This command produces the exp/training_cache directory containing the preprocessed dataset. Feel free to change TRAIN_TEST_SPLIT=navtrain within the script to suit your needs.
3. Training model:
To use WandB logging, first set the HYDRA_WANDB_ENTITY and HYDRA_WANDB_PROJECT environment variables. Training can then be started with
```
bash scripts/hydra_scripts/train_pdm_score_only.sh
```
In train_pdm_score_only.sh, only the final PDM score is used, both as the prediction target and as the cost function for selecting the best trajectory. In contrast, another example
```
bash scripts/hydra_scripts/train_multiple_loss_targets.sh
```
optimizes for multiple targets. In addition, agent.config.cost_function_weights can be changed at inference time for an already trained model.
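The effect of changing the weights at inference time can be illustrated with a small sketch. It assumes the cost function is a weighted sum of log-scores over the predicted metrics, which may differ from the actual implementation in this repo; `select_trajectory` and the metric names used as keys are hypothetical.

```python
import math


def select_trajectory(predicted_scores, weights, eps=1e-6):
    """Return the index of the candidate with the highest weighted log-score.

    predicted_scores: one dict per candidate trajectory, mapping a metric
        name to a predicted score in [0, 1].
    weights: dict mapping a metric name to its weight (hypothetical names).
    """
    def total(scores):
        # eps keeps log() finite when a predicted score is exactly 0.
        return sum(w * math.log(scores[name] + eps) for name, w in weights.items())

    return max(range(len(predicted_scores)), key=lambda i: total(predicted_scores[i]))


# Made-up predictions for two candidate trajectories.
candidates = [
    {"no_at_fault_collisions": 0.99, "ego_progress": 0.40},
    {"no_at_fault_collisions": 0.90, "ego_progress": 0.85},
]

# The same predictions lead to different choices under different weights.
safety_first = select_trajectory(candidates, {"no_at_fault_collisions": 10.0, "ego_progress": 1.0})
progress_first = select_trajectory(candidates, {"no_at_fault_collisions": 1.0, "ego_progress": 5.0})
print(safety_first, progress_first)  # prints "0 1"
```

Because the weights only affect how the predicted scores are combined, they can be tuned without retraining the model.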
4. Inference:
4.1 Local score computation:
First, prepare the metric cache by running
```
bash scripts/evaluation/run_metric_caching.sh
```
and then compute the metrics with
```
bash scripts/hydra_scripts/local_evaluation.sh
```
which produces a .csv file with per-scenario and aggregated metrics.
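For a quick sanity check of such a file, a column can be averaged with the standard library alone. This is a sketch under assumptions: the column names are taken from the results table further down, the sample values are made up, and a plain mean over rows may not match the aggregated value reported by the script if scenarios are weighted or filtered.

```python
import csv
import io


def mean_of_column(csv_text, column):
    """Average a numeric column, skipping rows where it is empty or non-numeric."""
    values = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            values.append(float(row[column]))
        except (KeyError, TypeError, ValueError):
            continue
    return sum(values) / len(values) if values else float("nan")


# Tiny stand-in for the real file; the values are made up for illustration.
sample = """token,ego_progress_stage_one,score
a1,0.75,0.50
b2,1.00,0.25
c3,,0.75
"""

print(mean_of_column(sample, "score"))                   # 0.5
print(mean_of_column(sample, "ego_progress_stage_one"))  # 0.875
```

To apply it to a real file, read the file contents into a string first and pass that string as `csv_text`.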

4.2 Submission generation:
Running
```
bash scripts/hydra_scripts/prepare_submission.sh
```
will produce a .pkl file that can be submitted to the leaderboard (currently disabled).

Preparing trajectories dictionary (optional)

Hydra-MDP selects the best trajectory from a fixed trajectory dictionary based on the cost function score. For ease of reproducibility, we provide precomputed dictionaries in the navsim/agents/hydra/trajectory_vocab/real directory of this repo. However, you can prepare your own dictionary with the navsim/agents/hydra/prepare_trajectories_bank.py script.
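As an illustration of how such a dictionary could be built, the sketch below clusters recorded trajectories with k-means and keeps the cluster centres as the vocabulary. K-means is an assumption made for this example and is not necessarily what prepare_trajectories_bank.py does; `build_vocabulary` and the flat `[x0, y0, x1, y1, ...]` layout are hypothetical.

```python
import random


def build_vocabulary(trajectories, k, iterations=20, seed=0):
    """Cluster equally long trajectories with Lloyd's k-means; return k centroids.

    Each trajectory is a flat list of floats, e.g. [x0, y0, x1, y1, ...].
    """
    rng = random.Random(seed)
    centroids = [list(t) for t in rng.sample(trajectories, k)]
    for _ in range(iterations):
        # Assignment step: attach every trajectory to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for traj in trajectories:
            dists = [sum((a - b) ** 2 for a, b in zip(traj, c)) for c in centroids]
            clusters[dists.index(min(dists))].append(traj)
        # Update step: move each centroid to the mean of its members.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous centroid
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


# Toy data: straight-ahead paths and left-curving paths, two waypoints each.
straight = [[1.0, 0.0, 2.0, d] for d in (-0.1, 0.0, 0.1)]
left = [[1.0, 1.0, 1.5, 3.0 + d] for d in (-0.1, 0.0, 0.1)]
vocab = build_vocabulary(straight + left, k=2)
print(sorted(vocab))  # one centroid per manoeuvre type
```

A realistic vocabulary would use many more trajectories and a much larger `k`; the toy data only shows that similar manoeuvres collapse into a single dictionary entry.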

Results

The following models were evaluated on the navhard_two_stage split (number of successful scenarios: 5912):

| metric | train_multiple_loss_targets | hydra_pdm_score_only | train_multiple_loss_targets_1024 | imitation_loss_only |
| --- | --- | --- | --- | --- |
| token | extended_pdm_score_combined | extended_pdm_score_combined | extended_pdm_score_combined | extended_pdm_score_combined |
| no_at_fault_collisions_stage_one | 0.9544 | 0.96 | 0.97 | 0.9533 |
| drivable_area_compliance_stage_one | 0.9489 | 0.9289 | 0.9378 | 0.6978 |
| driving_direction_compliance_stage_one | 0.9933 | 0.9989 | 0.9911 | 0.9756 |
| traffic_light_compliance_stage_one | 1.0 | 0.9978 | 0.9978 | 0.9933 |
| ego_progress_stage_one | 0.8352 | 0.8149 | 0.835 | 0.8295 |
| time_to_collision_within_bound_stage_one | 0.9556 | 0.96 | 0.9644 | 0.9378 |
| lane_keeping_stage_one | 0.9489 | 0.9267 | 0.9689 | 0.9311 |
| history_comfort_stage_one | 0.9778 | 0.9756 | 0.9756 | 0.9756 |
| pdm_score_no_masking_stage_one | 0.7619 | 0.7442 | 0.7635 | 0.5404 |
| pdm_score_proxy_stage_one | 0.8012 | 0.7705 | 0.7994 | 0.5906 |
| two_frame_extended_comfort_stage_one | 0.6 | 0.5467 | 0.6044 | 0.6578 |
| no_at_fault_collisions_stage_two | 0.8233 | 0.8337 | 0.8115 | 0.7967 |
| drivable_area_compliance_stage_two | 0.8315 | 0.8302 | 0.8232 | 0.6397 |
| driving_direction_compliance_stage_two | 0.8783 | 0.8917 | 0.8719 | 0.7932 |
| traffic_light_compliance_stage_two | 0.9826 | 0.9787 | 0.9794 | 0.9809 |
| ego_progress_stage_two | 0.8594 | 0.8154 | 0.8509 | 0.8409 |
| time_to_collision_within_bound_stage_two | 0.787 | 0.8018 | 0.7791 | 0.7609 |
| lane_keeping_stage_two | 0.482 | 0.4734 | 0.4556 | 0.445 |
| history_comfort_stage_two | 0.9527 | 0.9609 | 0.9634 | 0.9646 |
| two_frame_extended_comfort_stage_two | 0.5686 | 0.5879 | 0.6091 | 0.6533 |
| score | 0.3536 | 0.3445 | 0.3454 | 0.1786 |
| wandb_run | link | link | link | link |
| checkpoint | download | download | download | download |

Team behind this solution

Yuriy Biktairov
Stepan Konev

Citation

Below is the original README of the NavSim repo:

Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

Paper | Supplementary | Talk | 2024 Challenge | Leaderboard v2 | Warmup Leaderboard v2

NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking

Daniel Dauner¹,², Marcel Hallgarten¹,⁵, Tianyu Li³, Xinshuo Weng⁴, Zhiyu Huang⁴,⁶, Zetong Yang³,
Hongyang Li³, Igor Gilitschenski⁷,⁸, Boris Ivanovic⁴, Marco Pavone⁴,⁹, Andreas Geiger¹,², and Kashyap Chitta¹,²

¹University of Tübingen, ²Tübingen AI Center, ³OpenDriveLab at Shanghai AI Lab, ⁴NVIDIA Research
⁵Robert Bosch GmbH, ⁶Nanyang Technological University, ⁷University of Toronto, ⁸Vector Institute, ⁹Stanford University

Advances in Neural Information Processing Systems (NeurIPS), 2024
Track on Datasets and Benchmarks

Highlights

🔥 NAVSIM gathers simulation-based metrics (such as progress and time to collision) for end-to-end driving by unrolling simplified bird's eye view abstractions of scenes for a short simulation horizon. It operates under the condition that the policy has limited influence on the environment, which enables efficient, open-loop metric computation while being better aligned with closed-loop evaluations than traditional displacement errors.

This branch contains the code for NAVSIM v2, used in the 2025 NAVSIM challenge. For NAVSIM v1, as well as its navtest leaderboard, please check the v1.1 branch.

Getting started

Changelog

[2025/04/28] NAVSIM v2.2 release (official devkit version for AGC 2025)
- Release of private_test_hard dataset (see splits) for the HuggingFace NAVSIM v2 End-to-End Driving Challenge 2025 Leaderboard.
  - The submission deadline is 2025-05-11 00:00:00 UTC
  - You are limited to one upload per day on the challenge leaderboard, which should take approximately 2 hours to evaluate after a successful submission.
- Fixed bug in openscene_meta_datas for navhard and warmup
  - ⚠️ IMPORTANT: If you used navhard_two_stage/openscene_meta_datas or warmup_two_stage/openscene_meta_datas to evaluate your model, please re-download and use the new data.
[2025/04/24] NAVSIM v2.1.2 release
- Release of navhard_two_stage dataset (see splits)
- Updated Extended Predictive Driver Model Score (EPDMS) for the Hugging Face Warmup leaderboard. See metrics for details regarding the implementation.
[2025/04/13] NAVSIM v2.1.1 release
- Updated dataset for the warmup leaderboard with minor fixes
[2025/04/08] NAVSIM v2.1 release
- Added new dataset for the Hugging Face Warmup leaderboard (see submission)
- Introduced support for two-stage reactive traffic agents (see traffic simulation)
[2025/02/28] NAVSIM v2.0 release
- Extends the PDM Score with more metrics and penalties (see metrics)
- Adds a new two-stage pseudo closed-loop simulation (see metrics)
- Adds support for reactive traffic agent policies (see traffic simulation)
[2024/09/03] NAVSIM v1.1 release
- Leaderboard for navtest on Hugging Face
- Release of baseline checkpoints on Hugging Face
- Updated docs for submission and paper
[2024/04/21] NAVSIM v1.0 release (official devkit version for AGC 2024)
- Parallelization of metric caching / evaluation
- Adds Transfuser baseline (see agents)
- Adds standardized training and test filtered splits (see splits)
- Visualization tools (see tutorial_visualization.ipynb)
[2024/04/03] NAVSIM v0.4 release
- Support for test phase frames of competition
- Download script for trainval
- Egostatus MLP Agent and training pipeline
[2024/03/25] NAVSIM v0.3 release
- Adds code for Leaderboard submission
[2024/03/11] NAVSIM v0.2 release
- Easier installation and download
- mini and test data split integration
- Privileged Human agent
[2024/02/20] NAVSIM v0.1 release (initial demo)
- OpenScene-mini sensor blobs and annotation logs
- Naive ConstantVelocity agent

License and citation

All assets and code in this repository are under the Apache 2.0 license unless specified otherwise. The datasets (including nuPlan and OpenScene) inherit their own distribution licenses. Please consider citing our paper and project if they help your research.

@inproceedings{Dauner2024NEURIPS,
	author = {Daniel Dauner and Marcel Hallgarten and Tianyu Li and Xinshuo Weng and Zhiyu Huang and Zetong Yang and Hongyang Li and Igor Gilitschenski and Boris Ivanovic and Marco Pavone and Andreas Geiger and Kashyap Chitta},
	title = {NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking},
	booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
	year = {2024},
}

@misc{Contributors2024navsim,
    title={NAVSIM: Data-Driven Non-Reactive Autonomous Vehicle Simulation and Benchmarking},
    author={NAVSIM Contributors},
    howpublished={\url{https://github.com/autonomousvision/navsim}},
    year={2024}
}

Other resources

SLEDGE | tuPlan garage | CARLA garage | Survey on E2EAD
PlanT | KING | TransFuser | NEAT
