
MDriveBench: Multi-Agent Multi-Granular Driving Benchmark

TODO: Add project overview


Repository Structure and Scope

This repository implements MDriveBench, a multi-agent driving benchmark.
It was originally built on top of CoLMDriver; the benchmark infrastructure extends that codebase, and CoLMDriver is now one of several models evaluated in the repository.

MDriveBench provides:

  • Benchmark infrastructure (CARLA integration, scenarios, evaluation, analysis)
  • Multiple baseline and LLM-based driving models (TCP, CoDriving, LMDrive, UniAD, CoLMDriver, and VAD)
  • Training code for CoLMDriver components



Global Setup

General Setup

Two conda environments are needed: 'vllm' for MLLM inference and 'colmdriver' for simulation.

vLLM env

Step 1: Install conda (if not installed already)

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Step 2: Create the environment and install vLLM

conda create -n vllm python=3.10
conda activate vllm
pip install vllm
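
As an optional sanity check, you can confirm that vLLM and CUDA are visible inside this environment (a minimal sketch, assuming the vllm env is active):

python -c "import vllm, torch; print(vllm.__version__, torch.cuda.is_available())"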

CoLMDriver env

Step 1: Basic Installation for colmdriver

Get the code and create the PyTorch environment.

git clone https://github.com/marco-cos/CoLMDriver.git
cd CoLMDriver

conda create --name colmdriver python=3.7 cmake=3.22.1
conda activate colmdriver
conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
conda install cudnn -c conda-forge

pip install -r opencood/requirements.txt
pip install -r simulation/requirements.txt
pip install openai

Step 2: Download and set up CARLA 0.9.10.1.

chmod +x simulation/setup_carla.sh
./simulation/setup_carla.sh
easy_install carla/PythonAPI/carla/dist/carla-0.9.10-py3.7-linux-x86_64.egg
mkdir external_paths
ln -s ${PWD}/carla/ external_paths/carla_root
# If you already have CARLA, just create a soft link at external_paths/carla_root
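
If you are reusing an existing CARLA 0.9.10.1 install, a minimal sketch of the soft-link setup (the source path below is a placeholder):

# Link an existing CARLA 0.9.10.1 install instead of downloading a new one
mkdir -p external_paths
ln -s /path/to/your/CARLA_0.9.10.1 external_paths/carla_root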

The file structure should be:

|--CoLMDriver
    |--external_paths
        |--carla_root
            |--CarlaUE4
            |--Co-Simulation
            |--Engine
            |--HDMaps
            |--Import
            |--PythonAPI
            |--Tools
            |--CarlaUE4.sh
            ...

Note: we install setuptools==41 because this version still provides easy_install. After installing the carla .egg, you can install the latest setuptools to avoid the 'No module named distutils_hack' error.
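
A sketch of the install order implied by this note (41.0.0 is used as a concrete instance of the 41 series; versions are indicative, not prescriptive):

pip install "setuptools==41.0.0"   # old enough to still provide easy_install
easy_install carla/PythonAPI/carla/dist/carla-0.9.10-py3.7-linux-x86_64.egg
pip install -U setuptools          # upgrade afterwards to avoid the distutils_hack import error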

Steps 3, 4, and 5 set up the perception module.

Step 3: Install Spconv (1.2.1)

We use spconv 1.2.1 to generate voxel features in the perception module.

To install spconv 1.2.1, please follow the guide in https://github.com/traveller59/spconv/tree/v1.2.1.

Or run the following commands:

# 1. Activate your environment (if not activated already)
conda activate colmdriver

# 2. Install dependencies (these are all user-space)
conda install -y cmake=3.22.1 ninja boost ccache -c conda-forge
pip install pybind11 numpy

# 3. Clone spconv recursively (submodules required!)
git clone -b v1.2.1 --recursive https://github.com/traveller59/spconv.git
cd spconv

# 4. Build the wheel (will compile in your conda CUDA toolchain)
python setup.py bdist_wheel

# 5. Install the resulting .whl (no sudo needed)
pip install dist/spconv-1.2.1-*.whl

cd ..
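
To confirm the build succeeded, a quick import check (run inside the colmdriver env):

python -c "import spconv; print('spconv import OK')"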

Step 4: Set up the codebase

# Set up
python setup.py develop

# Compile the CUDA version of the bounding-box IoU op
python opencood/utils/setup.py build_ext --inplace 

Step 5: Install pypcd

# go to another folder
cd ..
git clone https://github.com/klintan/pypcd.git
cd pypcd
pip install python-lzf
python setup.py install
cd ..

Baseline Evaluation Setup

Evaluation of baselines

Set up the environments and get the checkpoints.

Methods              TCP            CoDriving
Installation Guide   github         github
Checkpoints          google drive   google drive

The downloaded checkpoints should follow this structure:

|--CoLMDriver
    |--ckpt
        |--codriving
            |--perception
            |--planning
        |--TCP
            |--new.ckpt

TCP Environment Setup

  1. Create TCP conda environment
cd CoLMDriver
conda env create -f model_envs/tcp_codriving.yaml -n tcp_codriving
conda activate tcp_codriving
  2. Set CARLA path environment variables
export CARLA_ROOT=PATHTOYOURREPOROOT/CoLMDriver/external_paths/carla_root
export PYTHONPATH=$CARLA_ROOT/PythonAPI:$CARLA_ROOT/PythonAPI/carla:$CARLA_ROOT/PythonAPI/carla/dist/carla-0.9.10-py3.7-linux-x86_64.egg
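
As a quick check that the CARLA egg is picked up by these paths (the printed file path will differ per machine):

python -c "import carla; print(carla.__file__)"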

CoDriving Environment Setup

  1. Create CoDriving conda environment
cd CoLMDriver
conda env create -f model_envs/tcp_codriving.yaml -n tcp_codriving
conda activate tcp_codriving
  2. Set CARLA path environment variables
export CARLA_ROOT=PATHTOYOURREPOROOT/CoLMDriver/external_paths/carla_root
export PYTHONPATH=$CARLA_ROOT/PythonAPI:$CARLA_ROOT/PythonAPI/carla:$CARLA_ROOT/PythonAPI/carla/dist/carla-0.9.10-py3.7-linux-x86_64.egg

LMDrive Environment Setup

  1. Clone LMDrive into the assets directory
git clone https://github.com/opendilab/LMDrive simulation/assets/LMDrive
  2. Prepare LMDrive checkpoints
cd simulation/assets/LMDrive
mkdir -p ckpt

Download and place the following into simulation/assets/LMDrive/ckpt:

Download and place the following into CoLMDriver/ckpt/llava-v1.5-7b:

  3. Create environment and install dependencies
cd CoLMDriver
conda env create -f model_envs/lmdrive.yaml -n lmdrive
conda activate lmdrive

pip install carla-birdeye-view==1.1.1 --no-deps
pip install -e simulation/assets/LMDrive/vision_encoder
  4. Set CARLA path environment variables
export CARLA_ROOT=PATHTOYOURREPOROOT/CoLMDriver/external_paths/carla_root
export PYTHONPATH=$CARLA_ROOT/PythonAPI:$CARLA_ROOT/PythonAPI/carla:$CARLA_ROOT/PythonAPI/carla/dist/carla-0.9.10-py3.7-linux-x86_64.egg

UniAD Environment Setup

UniAD is a unified perception–prediction–planning autonomous driving model.
We evaluate it on the InterDrive benchmark using its official pretrained weights and a pre-built conda environment, which avoids dependency conflicts and ensures that anyone can run UniAD without rebuilding the environment from scratch.

The YAML file for the UniAD environment is located in:

model_envs/uniad_env.yaml

To create and activate the environment:

conda env create -f model_envs/uniad_env.yaml -n uniad_env
conda activate uniad_env

UniAD runs inside the uniad_env conda environment, which contains all required CUDA, PyTorch, CARLA, and UniAD dependencies.

Additional Files

Create a ckpt/UniAD directory if it does not exist: mkdir -p CoLMDriver/ckpt/UniAD

Download the UniAD checkpoint from https://huggingface.co/rethinklab/Bench2DriveZoo/blob/main/uniad_base_b2d.pth and place it here:

CoLMDriver/ckpt/UniAD/uniad_base_b2d.pth

Download the UniAD config file from https://github.com/Thinklab-SJTU/Bench2DriveZoo/blob/uniad/vad/adzoo/uniad/configs/stage2_e2e/base_e2e_b2d.py and place it in:

simulation/assets/UniAD/base_e2e_b2d.py
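
One possible way to fetch the checkpoint from the command line; the resolve-style URL is an assumption derived from the browser link above, so verify it before relying on it:

# Run from the repository root; the resolve URL mirrors the blob link above (assumption)
mkdir -p ckpt/UniAD
wget -O ckpt/UniAD/uniad_base_b2d.pth \
    https://huggingface.co/rethinklab/Bench2DriveZoo/resolve/main/uniad_base_b2d.pth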

VAD Environment Setup

The YAML file for the VAD environment is located in:

model_envs/vad_env.yaml

  1. Create VAD conda environment
cd CoLMDriver
conda env create -f model_envs/vad_env.yaml -n vad
conda activate vad

  2. Start a CARLA instance

CUDA_VISIBLE_DEVICES=0 ./external_paths/carla_root/CarlaUE4.sh --world-port=2000 -prefer-nvidia
  3. Run VAD on InterDrive
# CARLA must already be running on port 2000
bash scripts/eval/eval_mode.sh 0 2000 vad ideal Interdrive_all

CoLMDriver Model Setup

Step 1: Download checkpoints from Google Drive. The downloaded CoLMDriver checkpoints should follow this structure:

|--CoLMDriver
    |--ckpt
        |--colmdriver
            |--LLM
            |--perception
            |--VLM
            |--waypoints_planner

To download the checkpoints from the command line and move them into the correct directories (no GUI required):

# In the CoLMDriver repository directory, with the colmdriver conda env activated
pip install gdown
gdown 1z3poGdoomhujCNQtoQ80-BCO34GTOLb-

mkdir ckpt
mv colmdriver.zip ckpt
cd ckpt
unzip colmdriver.zip
rm colmdriver.zip

# Fix obsolete dataset dependency bug
sed -i "s|root_dir: .*|root_dir: $(pwd)|; s|test_dir: .*|test_dir: $(pwd)|; s|validate_dir: .*|validate_dir: $(pwd)|" colmdriver/percpetion/config.yaml
touch dataset_index.txt

Step 2: Run the VLM and LLM servers (from the repository root)

#Enter conda ENV
conda activate vllm
# Launch the VLM server
CUDA_VISIBLE_DEVICES=6 vllm serve ckpt/colmdriver/VLM --port 1111 --max-model-len 8192 --trust-remote-code --enable-prefix-caching

# Launch the LLM server (in a new terminal, with the vllm env activated)
CUDA_VISIBLE_DEVICES=7 vllm serve ckpt/colmdriver/LLM --port 8888 --max-model-len 4096 --trust-remote-code --enable-prefix-caching

Make sure that CUDA_VISIBLE_DEVICES is set to an available GPU; you can check GPU availability with the nvidia-smi command.

Note: make sure the selected ports (1111, 8888) are not occupied by other services. If you use different ports, update the values of the 'comm_client' and 'vlm_client' keys in simulation/leaderboard/team_code/agent_config/colmdriver_config.yaml accordingly.
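
Before launching an evaluation, you can confirm both servers respond; vLLM exposes an OpenAI-compatible API, so listing the served models is enough (adjust the ports if you changed them):

curl http://localhost:1111/v1/models   # VLM server
curl http://localhost:8888/v1/models   # LLM server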


Benchmark Evaluation on InterDrive

All models are evaluated on the InterDrive benchmark using a unified interface:

bash scripts/eval/eval_mode.sh <GPU_ID> <CARLA_PORT> <MODEL_NAME> <MODE> <SCENARIO_SET>

Where:

  • <MODEL_NAME>: one of { colmdriver, tcp, codriving, lmdrive, uniad, vad }
  • <MODE>: one of { ideal, realtime } (where supported)
  • <SCENARIO_SET>: one of { Interdrive_all, Interdrive_no_npc, Interdrive_npc }

Make sure you have:

  • The corresponding conda environment activated for each model (e.g., tcp_codriving, lmdrive, uniad_env, colmdriver, etc.)
  • Any model-specific services running (e.g., VLM/LLM servers for CoLMDriver)

Start CARLA

# Start CARLA server; change port if 2000 is already in use
CUDA_VISIBLE_DEVICES=0 ./external_paths/carla_root/CarlaUE4.sh --world-port=2000 -prefer-nvidia

If CARLA segfaults on startup, try:

conda install -c conda-forge libglvnd mesa-libgl-devel libegl libxrender libxext libxi

Example evaluation commands

# TCP, full InterDrive
bash scripts/eval/eval_mode.sh 0 2000 tcp ideal Interdrive_all

# CoDriving, full InterDrive
bash scripts/eval/eval_mode.sh 0 2000 codriving ideal Interdrive_all

# LMDrive, full InterDrive
bash scripts/eval/eval_mode.sh 0 2000 lmdrive ideal Interdrive_all

# UniAD, full InterDrive
bash scripts/eval/eval_mode.sh 0 2000 uniad ideal Interdrive_all

# CoLMDriver: full benchmark, realtime mode, and subsets
bash scripts/eval/eval_mode.sh 0 2000 colmdriver ideal Interdrive_all
bash scripts/eval/eval_mode.sh 0 2000 colmdriver realtime Interdrive_all
bash scripts/eval/eval_mode.sh 0 2000 colmdriver ideal Interdrive_no_npc
bash scripts/eval/eval_mode.sh 0 2000 colmdriver ideal Interdrive_npc

Evaluation results are saved under:

results/results_driving_<MODEL_NAME>

For example:

  • results/results_driving_colmdriver
  • results/results_driving_tcp
  • results/results_driving_lmdrive

It’s recommended to run the LLM server, VLM server, CARLA server, and evaluation script in separate terminals. CARLA processes may fail to stop cleanly; kill them manually if needed.
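
A minimal way to clean up stale CARLA processes, if needed:

# Kills any process whose command line matches CarlaUE4; use with care on shared machines
pkill -f CarlaUE4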


LLM-Driven Scenario Generation

TODO: Add documentation for LLM-driven scenario generation, including:

  • Natural-language specification of driving scenarios
  • Conversion from language to executable scenarios
  • Support for multi-agent negotiation and coordination behaviors

Results Analysis and Visualization

Results Analysis

The repository includes a comprehensive results analysis script that generates detailed reports, visualizations, and statistics about driving performance and negotiation behavior.

Basic Usage

# Basic analysis of results directory
python visualization/results_analysis.py results/results_driving_colmdriver --output-dir report

# Multiple experiment folders
python visualization/results_analysis.py results/results_driving_colmdriver exp1 exp2 --output-dir report

# Generate single markdown report for multiple experiments
python visualization/results_analysis.py results/results_driving_colmdriver exp1 exp2 --output-dir report --markdown report/combined.md

Generated Analysis

The script generates:

  • Markdown Report: Comprehensive analysis with embedded figures
  • CSV Data Tables:
    • Per-route summary
    • Category summaries
    • Negotiation statistics (scenario/agent/setting breakdowns)
    • Infractions breakdown
  • Visualizations:
    • Driving scores by scenario category
    • Success rates across traffic conditions
    • NPC impact analysis
    • Negotiation frequency, rounds, and message-length distributions
    • Agent count distribution
    • Score distributions
    • Infractions breakdown
  • Artifacts:
    • Text report summarizing negotiation behavior
    • Collected nego.json files copied into the output directory for easy sharing

Key Metrics Analyzed

  • Driving Score (DS) and Success Rate
  • Route Categories (IC/LM/LC) performance
  • Impact of NPC traffic
  • Negotiation behavior:
    • Frequency
    • Number of rounds
    • Consensus scores
    • Safety scores
  • Agent counts and interactions
  • Infractions breakdown

The analysis helps understand:

  • How different traffic conditions affect performance
  • Which scenarios trigger most negotiations
  • How negotiation patterns vary across scenarios
  • Where driving performance needs improvement

Visualizing Results

The repository provides tools to generate videos from evaluation results:

# Generate video for a specific scenario
python visualization/gen_video.py path/to/scenario/folder --output scenario.mp4

# Options:
--fps VALUE           Set video framerate (default: 10)
--width VALUE         Set output width in pixels
--height VALUE        Set output height in pixels
--font-scale VALUE    Adjust text overlay size
--min-hold VALUE     Minimum seconds to show overlay text

# Examples:
# Basic video with default settings
python visualization/gen_video.py results/results_driving_colmdriver/route_00/0000 --output route00_test.mp4

# High quality render with custom settings
python visualization/gen_video.py results/results_driving_colmdriver/route_00/0000 \
    --output route00_hq.mp4 \
    --fps 30 \
    --width 1920 \
    --height 1080 \
    --font-scale 1.2

# Process multiple scenarios
python visualization/gen_video.py results/results_driving_colmdriver/route_*/0000 \
    --output-dir videos/

Features:

  • Multi-vehicle perspective rendering
  • Negotiation overlay visualization
  • Configurable resolution and framerate
  • Automatic scenario discovery
  • Progress tracking
  • Font size and text display customization

Dataset

The dataset for training CoLMDriver is obtained from V2Xverse, which contains expert behaviors in CARLA. You may get the dataset in two ways:

The dataset should be linked or stored under external_paths/data_root/, following this structure:

|--data_root
    |--weather-0
        |--data
            |--routes_town{town_id}_{route_id}_w{weather_id}_{datetime}
                |--ego_vehicle_{vehicle_id}
                    |--2d_bbs_{direction}
                    |--3d_bbs
                    |--actors_data
                    |--affordances
                    |--bev_visibility
                    |--birdview
                    |--depth_{direction}
                    |--env_actors_data
                    |--lidar
                    |--lidar_semantic_front
                    |--measurements
                    |--rgb_{direction}
                    |--seg_{direction}
                    |--topdown
                |--rsu_{vehicle_id}
                |--log
            ...
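
If you already have the V2Xverse dataset downloaded elsewhere, a minimal sketch of linking it into the expected location (the source path below is a placeholder):

mkdir -p external_paths
ln -s /path/to/your/v2xverse_dataset external_paths/data_root
ls external_paths/data_root/weather-0/data | head   # quick structure check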

Training

Perception module

Our perception module follows CoDriving. To train the perception module from scratch or continue from a checkpoint, run the following commands:

# Single GPU training
python opencood/tools/train.py -y opencood/hypes_yaml/v2xverse/colmdriver_multiclass_config.yaml [--model_dir ${CHECKPOINT_FOLDER}]

# DDP training
CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch  --nproc_per_node=2 --use_env opencood/tools/train_ddp.py -y opencood/hypes_yaml/v2xverse/colmdriver_multiclass_config.yaml [--model_dir ${CHECKPOINT_FOLDER}]

# Offline testing of perception
python opencood/tools/inference_multiclass.py --model_dir ${CHECKPOINT_FOLDER}

The training outputs can be found at opencood/logs. Arguments Explanation:

  • model_dir (optional): the path to the checkpoints, used for fine-tuning or continued training. When model_dir is given, the trainer discards the hypes_yaml and loads the config.yaml in the checkpoint folder instead; in this case, ${CONFIG_FILE} can be None (see the example below).
  • --nproc_per_node indicates the number of GPUs to use.
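
For instance, a hypothetical continued-training run (the log folder name is a placeholder):

python opencood/tools/train.py \
    -y opencood/hypes_yaml/v2xverse/colmdriver_multiclass_config.yaml \
    --model_dir opencood/logs/your_previous_run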

Planning module

Given a perception checkpoint, we freeze its parameters and train the downstream planning module end to end. The planner takes the BEV perception feature and occupancy map as input and predicts the future waypoints of the ego vehicle.

Train the planning module with a given perception checkpoint on multiple GPUs:

# Train planner
bash scripts/train/train_planner_e2e.sh $GPU_ids $num_GPUs $perception_ckpt $planner_config $planner_ckpt_resume $name_of_log $save_path

# Example
bash scripts/train/train_planner_e2e.sh 0,1 2 ckpt/colmdriver/percpetion covlm_cmd_extend_adaptive_20 None log ./ckpt/colmdriver_planner

# Offline test
bash scripts/eval/eval_planner_e2e.sh 0,1 ckpt/colmdriver/percpetion covlm_cmd_extend_adaptive_20 ckpt/colmdriver/waypoints_planner/epoch_26.ckpt ./ckpt/colmdriver_planner

VLM planner

Data generation

Our training data is also provided on Google Drive for reference. Since the images originate from the local V2Xverse dataset, you still need to download the dataset to get full access.

Lora Finetuning

We use ms-swift to fine-tune the MLLMs. For installation and details, refer to the official repo. We provide an example script in MLLMs/finetune.sh.


Acknowledgements

This implementation is based on code from several repositories.
