ASkDAgger: Active Skill-level Data Aggregation for Interactive Imitation Learning

This repository contains the code for the CLIPort experiments from the paper ASkDAgger: Active Skill-level Data Aggregation for Interactive Imitation Learning.

[Figure: Overview of ASkDAgger]

Figure 1: The Active Skill-level Data Aggregation (ASkDAgger) framework consists of three main components: S-Aware Gating (SAG), Foresight Interactive Experience Replay (FIER), and Prioritized Interactive Experience Replay (PIER). In this interactive imitation learning framework, we allow the novice to say: "I plan to do this, but I am uncertain." The uncertainty gating threshold is set by SAG to track a user-specified metric: sensitivity, specificity, or minimum system success rate. Teacher feedback is obtained with FIER, enabling demonstrations through validation, relabeling, or teacher demonstrations. Lastly, PIER prioritizes replay based on novice success, uncertainty, and demonstration age.
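
To make the interplay concrete, the snippet below is a minimal, purely illustrative Python sketch (the function and variable names are hypothetical and do not correspond to the askdagger_cliport API): an uncertainty threshold decides whether the teacher is queried, and a replay priority combines novice success, uncertainty, and demonstration age.

from dataclasses import dataclass

@dataclass
class Proposal:
    action: str          # the novice's planned action
    uncertainty: float   # higher means the novice is less confident

def query_teacher(proposal: Proposal, threshold: float) -> bool:
    """S-Aware Gating: query the teacher only when the novice is uncertain."""
    return proposal.uncertainty > threshold

def replay_priority(success: bool, uncertainty: float, age: int, decay: float = 0.99) -> float:
    """Illustrative PIER-style priority: failed, uncertain, and recent demonstrations rank higher."""
    failure_bonus = 0.0 if success else 1.0
    return (failure_bonus + uncertainty) * decay ** age

# An uncertain proposal triggers a teacher query (validation, relabeling, or a demonstration),
# and the resulting demonstration is stored with a replay priority.
proposal = Proposal(action="pack the letter E block in the brown box", uncertainty=0.8)
if query_teacher(proposal, threshold=0.5):
    print(replay_priority(success=False, uncertainty=proposal.uncertainty, age=0))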

Installation Instructions

Prerequisites: install uv

It is advised to use uv to install the dependencies of the askdagger_cliport package. Please make sure uv is installed according to its installation instructions.

Install askdagger_cliport

Clone askdagger_cliport, enter the repository folder, and set the ASKDAGGER_ROOT environment variable:

git clone git@github.com:askdagger/askdagger_cliport.git
cd askdagger_cliport
export ASKDAGGER_ROOT=$(pwd)
echo "export ASKDAGGER_ROOT=$(pwd)" >> ~/.bashrc

Create a virtual environment:

uv venv --python 3.10

Source the virtual environment:

source .venv/bin/activate

Install the askdagger_cliport package:

uv pip install -e .

Download the Google objects:

./scripts/google_objects_download.sh

Validate the installation

You can validate the installation by performing interactive training with ASkDAgger using the train_interactive.py script, which requires 8GB of GPU memory:

python src/askdagger_cliport/train_interactive.py interactive_demos=3 save_every=3 disp=True exp_folder=exps_test train_interactive.batch_size=1

After training with ASkDAgger, you can evaluate the policy:

python src/askdagger_cliport/eval.py interactive_demos=3 n_demos=5 disp=True exp_folder=exps_test

The policy will probably fail, as it needs more demonstrations and training steps to converge, but the code should run without errors.

It is also possible to run the test suite to confirm that everything is working correctly:

pytest tests

Download and plot results from paper

You can download the results from the paper as follows:

python scripts/results_download.py

Next, you can plot the results:

python figures/demo_types.py
python figures/domain_shift.py
python figures/training_scenario1.py
python figures/training_scenario2.py
python figures/training_scenario3.py
python figures/sensitivity.py
python figures/evaluation.py
python figures/real.py

The figures should appear in the figures directory.

Train the models yourself

Reproduction of results in the paper

To reproduce the results from the paper, you will need about 40GB of GPU memory. A set of batch scripts for SLURM jobs is available under scripts. These serve as templates and should be updated based on the specifications of your own system. If done properly, the models can be trained and evaluated as follows:

Run the following twice, once with ASKDAGGER set to True and once with ASKDAGGER set to False:

sbatch --array=0-39 scripts/train_interactive.sh

Run the following twice, once with ASKDAGGER set to True and once with ASKDAGGER set to False:

sbatch --array=0-239 scripts/eval.sh

Run the following twice, once with ASKDAGGER set to True and once with ASKDAGGER set to False:

sbatch --array=0-239 scripts/eval_unseen.sh

Run the following three times: with both PIER and FIER set to True, with PIER set to True and FIER set to False, and with PIER set to False and FIER set to True:

sbatch --array=0-9 scripts/train_interactive_domain_shift.sh

Interactive training of a single model

A single model can be trained with the following command:

python src/askdagger_cliport/train_interactive.py

The available arguments for training and evaluation can be found in src/askdagger_cliport/cfg. The model can be evaluated with:

python src/askdagger_cliport/eval.py

Interactive training after BC pretraining

It is also possible to first perform offline Behavioral Cloning (BC) training and then continue with interactive training. In this example, we pretrain with 50 offline demos (25 train, 25 val). First, create the demonstrations:

python src/askdagger_cliport/demos.py mode=train n=25
python src/askdagger_cliport/demos.py mode=val n=25

Next, you can perform offline BC training:

python src/askdagger_cliport/train.py train.n_demos=25 train.n_val=25 train.n_steps=200 train.save_steps=[200]

Afterwards, you can continue training interactively using ASkDAgger:

python src/askdagger_cliport/train_interactive.py train_demos=25 train_steps=200

Finally, the model can be evaluated as follows:

python src/askdagger_cliport/eval.py train_demos=25

Notebook and Colab

We have prepared a Jupyter Notebook for getting acquainted with the code. It walks you through the interactive training procedure and visualizes the novice's actions and the demonstrations. You can open the notebook by starting Jupyter-Lab:

jupyter-lab $ASKDAGGER_ROOT

and then, in Jupyter-Lab, open askdagger_cliport.ipynb in the notebooks folder.

The notebook is also available for Colab.

Credits

This work uses code from the following open-source projects and datasets:

CLIPort

Original: https://github.com/cliport/cliport
License: Apache 2.0
Changes: The code under src is largely based on the CLIPort codebase. We created new files for interactive training, such as interactive_agent.py, pier.py, sag.py, train_interactive.py, and uncertainty_quantification.py. In clip_lingunet_lat.py and resnet_lat.py, some layers were removed to reduce the GPU memory footprint. Furthermore, we replaced the ReLU activations in CLIPort with LeakyReLU, since we experienced problems with vanishing gradients during interactive training. Otherwise, minor changes were made to facilitate interactive training, relabeling of demonstrations, and prioritization with PIER.
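
As a minimal illustration of the activation swap (a hedged PyTorch sketch with made-up layer sizes, not the actual CLIPort modules):

import torch
import torch.nn as nn

# Toy convolutional block: the only point is the activation choice.
# LeakyReLU keeps a small gradient for negative pre-activations,
# which is the motivation for replacing ReLU in the CLIPort layers.
block = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),
    nn.LeakyReLU(negative_slope=0.01, inplace=True),
    nn.Conv2d(16, 16, kernel_size=3, padding=1),
    nn.LeakyReLU(negative_slope=0.01, inplace=True),
)

print(block(torch.randn(1, 3, 64, 64)).shape)  # torch.Size([1, 16, 64, 64])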

CLIPort-Batchify

Original: https://github.com/ChenWu98/cliport-batchify
License: Apache 2.0
Changes: We implemented batch training for CLIPort following the changes in this repo.

Google Ravens (TransporterNets)

Original: https://github.com/google-research/ravens
License: Apache 2.0
Changes: We use the tasks as adapted for CLIPort to include unseen objects as distractor objects. We also created packing-seen-shapes and packing-unseen-shapes tasks rather than only a packing-shapes task.

OpenAI CLIP

Original: https://github.com/openai/CLIP
License: MIT
Changes: We used CLIP as adapted for CLIPort, with minor bug fixes.

Google Scanned Objects

Original: Dataset
License: Creative Commons BY 4.0
Changes: We use the objects as adapted for CLIPort, with the center of mass (COM) fixed to the geometric center for selected objects.

U-Net

Original: https://github.com/milesial/Pytorch-UNet/
License: GPL 3.0
Changes: Used as is in unet.py. Note: This part of the code is GPL 3.0.
