You can run a wide range of experiments using the spinup.run command wrapper. Here are the main commands and some of the most useful arguments to control your experiments.
Training an Agent
This is the main command for training an agent with a specific algorithm.
python -m spinup.run [ALGORITHM] [ARGUMENTS...]
[ALGORITHM]: The name of the reinforcement learning algorithm you want to use. Your modernized setup supports the PyTorch implementations of the following algorithms:
ppo
vpg
ddpg
td3
sac
[ARGUMENTS...]: You can provide arguments to configure the training run. The most common ones are:
--env [ENV_NAME]: Specifies the environment to train on (e.g., LunarLander-v3, Ant-v4, Humanoid-v4).
--exp_name [NAME]: Gives your experiment a custom name.
--epochs [NUMBER]: Sets the number of training epochs.
--steps [NUMBER]: Sets the number of steps per epoch.
--seed [NUMBER]: Sets the random seed for reproducibility.
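For example, a single run combining several of these flags (a minimal sketch; LunarLander-v3 assumes Gymnasium with the Box2D extras installed):

python -m spinup.run ppo --env LunarLander-v3 --exp_name lunar_ppo --epochs 50 --steps 4000 --seed 0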
Train a Soft Actor-Critic (SAC) agent on the Ant environment for 100 epochs:
python -m spinup.run sac --env Ant-v4 --epochs 100 --exp_name my_ant_sac
Or train on the newer Ant-v5 for 200 epochs with a dated experiment name:
python -m spinup.run sac --env Ant-v5 --epochs 200 --exp_name exp_sac_antv5_july_20_2025
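Beyond the common flags, the upstream run wrapper also lets you set any keyword argument of the algorithm function from the command line, with --hid as a shorthand for the actor-critic hidden sizes. Assuming your fork keeps that upstream behavior, a sketch:

# gamma is SAC's discount factor; --hid sets the network hidden sizes upstream
python -m spinup.run sac --env Ant-v4 --epochs 100 --gamma 0.98 --hid "[256,256]" --exp_name my_ant_sac_wide

By default, results land in data/[exp_name]/[exp_name]_s[seed] (experiment name plus a seed suffix), which is the path you will pass to test_policy and plot below.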
Testing a Trained Agent
This command loads a saved policy and runs it in the environment so you can watch it perform.
python -m spinup.run test_policy [PATH_TO_EXPERIMENT_DATA]
[PATH_TO_EXPERIMENT_DATA]: The full path to the directory where your model was saved (e.g., data/my_ant_sac/my_ant_sac_s0).
Watch the Ant agent you just trained for 5 episodes:
python -m spinup.run test_policy data/my_ant_sac/my_ant_sac_s0 --episodes 5
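The upstream test_policy script also accepts --len to cap episode length, --norender to skip rendering, --itr to pick a specific saved checkpoint, and --deterministic to use SAC's deterministic action; check your fork before relying on them. If you would rather drive a saved policy from your own script, the upstream repo exposes helpers in spinup/utils/test_policy.py. A minimal sketch, assuming your fork keeps the same module path and save format:

# Load a saved policy and roll it out programmatically.
# load_policy_and_env and run_policy come from the upstream
# spinup/utils/test_policy.py; a modernized fork may differ.
from spinup.utils.test_policy import load_policy_and_env, run_policy

# Returns the saved environment and a get_action(obs) -> action function.
env, get_action = load_policy_and_env('data/my_ant_sac/my_ant_sac_s0')

# Watch 5 episodes with rendering enabled.
run_policy(env, get_action, num_episodes=5, render=True)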
Plotting Results
This command reads the progress.txt file from one or more experiments and generates performance graphs.
Plot a single experiment:
python -m spinup.run plot [PATH_TO_EXPERIMENT_DATA]
Plot and compare multiple experiments on the same graph:
python -m spinup.run plot [PATH_1] [PATH_2] ... [PATH_N]
Compare the performance of two different experiments on the same plot:
python -m spinup.run plot data/my_ppo_run/my_ppo_run_s0 data/my_sac_run/my_sac_run_s0
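The upstream plotter takes a few useful options on top of the paths: --legend to label each curve, --xaxis to choose the x-axis column (TotalEnvInteracts by default), --value for the y-axis metric (Performance by default), and --smooth for a moving-average window. Assuming your fork keeps these flags, a sketch:

python -m spinup.run plot data/my_ppo_run/my_ppo_run_s0 data/my_sac_run/my_sac_run_s0 --legend PPO SAC --value Performance --smooth 5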
Running Multiple Experiments (Grid Search)
This is a powerful feature for hyperparameter tuning. You can provide a list of values for any argument, and Spinning Up will automatically run an experiment for each combination.
You can simply list multiple values after an argument flag.
Let's test PPO on LunarLander-v3 with three different learning rates:
python -m spinup.run ppo --env LunarLander-v3 --pi_lr 0.001 0.0003 0.00001 --exp_name lunar_lr_search
This single command launches three full experiments, one per learning rate. Upstream, each variant is saved in its own directory under data/, with a shorthand for the varied hyperparameter appended to the experiment name; your fork's layout may differ slightly. You can then pass those directories to the plot command to see which learning rate performed best.
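Grids can span several arguments at once, including --seed, and every combination gets its own run; the upstream wrapper also accepts --dt to stamp experiment names with the date and time. Assuming those upstream behaviors, a sketch that launches 2 learning rates x 3 seeds = 6 runs:

python -m spinup.run ppo --env LunarLander-v3 --pi_lr 0.001 0.0003 --seed 0 10 20 --exp_name lunar_grid --dt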