This repository contains the code used in this paper to generate scenes for trajectory forecasting.
Amelia: A Large Dataset and Model for Airport Surface Movement Forecasting [paper]
Ingrid Navarro *, Pablo Ortega-Kral *, Jay Patrikar *, Haichuan Wang, Zelin Ye, Jong Hoon Park, Jean Oh and Sebastian Scherer
AmeliaScenes: Tool for generating airport surface movement scenes from raw trajectory data collected with AmeliaSWIM.
AmeliaScenes also provides scene- and per-agent-characterization tools as meta information for each agent's kinematic and interactive profile.
Finally, AmeliaScenes also provides a dataset splitting script with various train/val/test splitting strategies.
To run this repository, you first need to download the Amelia dataset from AmeliaCMU. Currently, there are two versions of the dataset available:
- Amelia-10: contains 1 month of data for each of 10 airports.
- Amelia42-Mini: contains 15 days of data for each of 42 airports.
To download the dataset, install Git LFS, then run the following commands:
git lfs install
git clone https://huggingface.co/datasets/AmeliaCMU/Amelia-10
Once downloaded, create a symbolic link in datasets from the AmeliaScenes repository:
cd datasets
ln -s /path/to/Amelia-10/data amelia
The resulting structure should look like this:
|-- AmeliaScenes
|-- datasets
    |-- amelia
        |-- traj_data_a10v08
            |-- raw_trajectories
                |-- kbos
                |-- kdca
                |-- kewr
                |-- kjfk
                |-- klax
                |-- kmdw
                |-- kmsy
                |-- ksea
                |-- ksfo
                |-- panc
For more details about the dataset, see here.
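Before running the processor, it can help to confirm that the symbolic link resolves to the expected layout. The following is a minimal sketch, not part of AmeliaScenes: the function name `summarize_raw_trajectories` is hypothetical, and it only assumes the `traj_data_<version>/raw_trajectories/<airport_icao>/` layout described above.

```python
from pathlib import Path


def summarize_raw_trajectories(base_dir, traj_version="a10v08"):
    """Return {airport_icao: number of CSV files} found under raw_trajectories.

    Hypothetical helper: it only inspects the directory layout described in
    this README and does not read the CSV files themselves.
    """
    raw_dir = Path(base_dir) / f"traj_data_{traj_version}" / "raw_trajectories"
    if not raw_dir.is_dir():
        raise FileNotFoundError(f"Expected directory not found: {raw_dir}")

    summary = {}
    # One sub-directory per airport, named by its ICAO code (e.g. kbos).
    for airport_dir in sorted(p for p in raw_dir.iterdir() if p.is_dir()):
        summary[airport_dir.name] = len(list(airport_dir.glob("*.csv")))
    return summary


# Example usage (path is relative to where the script is run):
# for airport, n_files in summarize_raw_trajectories("datasets/amelia").items():
#     print(f"{airport}: {n_files} CSV file(s)")
```

An airport with zero CSV files usually means Git LFS did not pull the data or the link points at the wrong directory.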
Make sure that you have conda installed.
Recommended: use install.sh to download and install the Amelia Framework:
chmod +x install.sh
./install.sh amelia
This will create a conda environment named amelia and install all dependencies.
Or, refer to INSTALL.md for manual installation.
Activate your Amelia environment (follow the installation instructions above first):
conda activate amelia
Once the tools and environment are set up, process the data and generate the .pkl files:
cd amelia_scenes
python amelia_scenes/run_processor.py presets=amelia_10
This script processes all the raw trajectory CSV files found in datasets/amelia/traj_data_a10v08/raw_trajectories/<airport_icao>/ and generates scenes for each airport, identified by ICAO code: kbos, kdca, kewr, kjfk, klax, kmdw, kmsy, ksea, ksfo, and panc.
Additional parameters can also be changed in the configuration file found in amelia_scenes/configs/processor/amelia_10.yaml. For example:
- <to_process>: What to process. By default, it is set to both. Possible options are:
  - scenes: only generates scenes from the raw files.
  - metas: generates meta information from already generated scenes, using the scene scoring tools.
  - both: generates scenes and meta information together.
- <base_dir>: Path to the dataset. By default, it is set to ../datasets/amelia.
- <traj_version>: Version of the trajectory data. By default, it is set to a10v08.
- <graph_version>: Version of the graph data. By default, it is set to a10v01os.
- <overwrite>: Whether processing should overwrite existing data. By default, it is set to True.
- <perc_process>: Upper limit on the fraction of the data to process. By default, it is set to 1.0.
- <seed>: Seed for the random number generator. By default, it is set to 42.
- <jobs>: Number of Python worker processes to run in parallel. By default, it is set to -1, which uses all available CPUs.
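To make the defaults above easier to see at a glance, here is a minimal sketch that mirrors them in a small Python container. `ProcessorConfig` and `num_workers` are hypothetical names used for illustration only; they are not the repository's actual configuration classes. The validation ranges (for example, `perc_process` in (0, 1]) are assumptions as well.

```python
import os
from dataclasses import dataclass

VALID_TO_PROCESS = ("scenes", "metas", "both")


@dataclass
class ProcessorConfig:
    """Hypothetical container; default values are copied from the list above."""
    to_process: str = "both"
    base_dir: str = "../datasets/amelia"
    traj_version: str = "a10v08"
    graph_version: str = "a10v01os"
    overwrite: bool = True
    perc_process: float = 1.0
    seed: int = 42
    jobs: int = -1

    def __post_init__(self):
        if self.to_process not in VALID_TO_PROCESS:
            raise ValueError(f"to_process must be one of {VALID_TO_PROCESS}")
        # Assumption: perc_process is a fraction in (0, 1].
        if not 0.0 < self.perc_process <= 1.0:
            raise ValueError("perc_process must be in (0, 1]")
        if self.jobs == 0 or self.jobs < -1:
            raise ValueError("jobs must be -1 or a positive integer")

    def num_workers(self):
        """Resolve jobs=-1 to the number of available CPUs."""
        if self.jobs == -1:
            return os.cpu_count() or 1
        return self.jobs


# Example usage:
# cfg = ProcessorConfig(to_process="scenes", jobs=4)
# print(cfg.num_workers())
```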
For a given CSV file, the scene processor should generate scene files in datasets/amelia/traj_data_{version}/proc_scenes/{airport_icao}/{raw_file_tagname}. Each scene is saved as a pickle file named <scene_id>.pkl.
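Since each scene is a pickle file, it can be inspected with the standard library. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the internal structure of a scene object is not documented here, so the code only reports the object's type and, if it happens to be a dictionary, its keys.

```python
import pickle
from pathlib import Path


def list_scene_files(scene_dir):
    """Return the .pkl files in a scene directory, sorted by name."""
    return sorted(Path(scene_dir).glob("*.pkl"))


def load_scene(scene_path):
    """Load one scene. Only unpickle files from a source you trust."""
    with open(scene_path, "rb") as f:
        return pickle.load(f)


def describe_scene(scene):
    """Summarize a loaded scene without assuming its schema."""
    if isinstance(scene, dict):
        return f"dict with keys: {sorted(map(str, scene))}"
    return f"object of type {type(scene).__name__}"


# Example usage (directory path is a placeholder):
# for path in list_scene_files("/path/to/scene_dir")[:3]:
#     print(path.name, "->", describe_scene(load_scene(path)))
```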
For example, if the input file is KBOS_1_1672531200.csv, found in:
|-- datasets
    |-- amelia
        |-- traj_data_a10v08
            |-- raw_trajectories
                |-- kbos
                    |-- KBOS_1_1672531200.csv
                |-- other airports
The output scenes will be in:
|-- datasets
    |-- amelia
        |-- traj_data_a10v08
            |-- raw_trajectories
            |-- proc_scenes
                |-- kbos
                    |-- KBOS_1_1672531200
                        |-- 00000.pkl
                        ...
                        |-- xxxxx.pkl
Once the scenes are generated, run run_create_splits.py to split the dataset. The script can be run as follows:
cd amelia_scenes
python run_create_splits.py --split_type <random | day | month> --airport <airport_icao>
Where:
- <split_type>: Type of split to generate. By default, it is set to random. Possible options are:
  - random: Randomly splits the dataset into train/val/test sets.
  - day: Splits the dataset into train/val/test sets by day.
  - month: Splits the dataset into train/val/test sets by month.
- <airport_icao>: ICAO code of the airport. By default, it is set to all.
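To illustrate the difference between the random and day strategies, here is a minimal sketch. It is not the logic of run_create_splits.py, which is not shown in this README. The function names and the 70/10/20 ratios are hypothetical, and the day grouping assumes that the trailing number in a tag such as KBOS_1_1672531200 is a Unix timestamp in seconds (UTC).

```python
import random
from collections import defaultdict
from datetime import datetime, timezone


def random_split(items, seed=42, ratios=(0.7, 0.1, 0.2)):
    """Shuffle items with a fixed seed and cut them into train/val/test.

    The ratios are an assumption; the test set takes whatever remains.
    """
    items = sorted(items)  # sort first so the result depends only on the seed
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * ratios[0])
    n_val = int(len(items) * ratios[1])
    return {
        "train": items[:n_train],
        "val": items[n_train:n_train + n_val],
        "test": items[n_train + n_val:],
    }


def group_by_day(tagnames):
    """Group file tags by UTC calendar day, parsed from the trailing timestamp."""
    groups = defaultdict(list)
    for tag in tagnames:
        timestamp = int(tag.rsplit("_", 1)[-1])
        day = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime("%Y-%m-%d")
        groups[day].append(tag)
    return dict(groups)


# Example usage: a day-level split assigns whole days, not single files,
# so files from the same day never end up in different sets.
# days = group_by_day(["KBOS_1_1672531200", "KBOS_2_1672534800"])
# day_split = random_split(list(days))
```

Splitting at the day or month level keeps temporally adjacent scenes in the same set, which a purely random split does not guarantee.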
Additional parameters can also be specified:
cd amelia_scenes
python run_create_splits.py --split_type <random | day | month> \
--base_dir <path_to_dataset> \
--seed <seed> \
--traj_version <version> \
--airport <airport_icao>
- <base_dir>: Path to the dataset. By default, it is set to ../datasets/amelia.
- <traj_version>: Version of the trajectory data. By default, it is set to a10v08 to match the current released version.
- <seed>: Seed for the random number generator. By default, it is set to 42.
- <airport>: ICAO code of the airport. By default, it is set to all.
For the scenes generated for kbos, with --split_type set to random, the script should generate the following files:
|-- datasets
    |-- amelia
        |-- traj_data_a10v08
            |-- raw_trajectories
            |-- proc_scenes
            |-- splits
                |-- train_splits
                    |-- kbos_random.txt
                |-- val_splits
                    |-- kbos_random.txt
                |-- test_splits
                    |-- kbos_random.txt
If you find our work useful in your research, please cite us!
@inbook{navarro2024amelia,
author = {Ingrid Navarro and Pablo Ortega and Jay Patrikar and Haichuan Wang and Zelin Ye and Jong Hoon Park and Jean Oh and Sebastian Scherer},
title = {AmeliaTF: A Large Model and Dataset for Airport Surface Movement Forecasting},
booktitle = {AIAA AVIATION FORUM AND ASCEND 2024},
chapter = {},
pages = {},
doi = {10.2514/6.2024-4251},
URL = {https://arc.aiaa.org/doi/abs/10.2514/6.2024-4251},
eprint = {https://arc.aiaa.org/doi/pdf/10.2514/6.2024-4251},
}
