This repository hosts the code for the experiments section of the paper "Stochastic Policy Gradient Methods: Improved Sample Complexity for Fisher-non-degenerate Policies" by Ilyas Fatkhullin, Anas Barakat, Anastasia Kireeva, and Niao He (2023).
It contains implementations of N-PG-IGT, (N)-HARPG, and Vanilla-PG.
This code is based on the garage repository. To install it together with our algorithm implementations, navigate to the directory containing the code and install garage as an editable package:

```
pip install -e '.[all,dev]'
```

To run experiments on MuJoCo environments (e.g., Humanoid, Hopper, HalfCheetah), you additionally need to install MuJoCo. The installation guide can be found in the mujoco-py repository.
The main experiment file, run_PG.py, is located in the examples_PolicyGradient directory. You can specify various parameters to control the experiment settings:
- `seed`: the random seed for reproducibility.
- `batch_size`: the number of samples per batch at each iteration.
- `gamma_0`: the initial stepsize.
- `method`: the optimization method (`sgd` = Vanilla SGD, `nigt` = N-PG-IGT, `nsgdm` = Normalized SGD, `nstormhess` = N-HARPG, `stormhess` = HARPG).
- `eta_0`: the initial momentum parameter.
- `env`: the environment (`walker`, `acrobot`, `cartpole`, `halfcheetah`, `hopper`, `humanoid`, `reacher`, `swimmer`).
- `logdir`: the path to the directory for logs.
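For illustration, here is a minimal sketch of how a command-line interface with these parameters could be wired up with argparse. The flag names mirror the list above, but the defaults and the overall structure are assumptions for illustration, not the repository's actual parser.

```python
# Hypothetical sketch of the CLI described above; defaults are illustrative.
import argparse

def build_parser():
    parser = argparse.ArgumentParser(description="Policy gradient experiments (sketch)")
    parser.add_argument("-seed", type=int, default=1, help="random seed for reproducibility")
    parser.add_argument("-batch_size", type=int, default=5000, help="samples per batch at each iteration")
    parser.add_argument("-gamma_0", type=float, default=0.1, help="initial stepsize")
    parser.add_argument("-method", default="sgd",
                        choices=["sgd", "nigt", "nsgdm", "nstormhess", "stormhess"],
                        help="optimization method")
    parser.add_argument("-eta_0", type=float, default=0.5, help="initial momentum parameter")
    parser.add_argument("-env", default="cartpole",
                        choices=["walker", "acrobot", "cartpole", "halfcheetah",
                                 "hopper", "humanoid", "reacher", "swimmer"],
                        help="environment")
    parser.add_argument("-epochs", type=int, default=100, help="number of training epochs")
    parser.add_argument("-logdir", type=str, default="./logs", help="directory for logs")
    return parser

if __name__ == "__main__":
    args = build_parser().parse_args()
    print(vars(args))  # placeholder for launching the actual experiment
```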
To run an experiment with a specific configuration, navigate to the repository's root directory and use a command such as:

```
python examples_PolicyGradient/run_PG.py -seed 11 -batch_size 5000 -gamma_0 0.1 -method sgd -env cartpole -epochs 100
```
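To average results over several random seeds, you could launch the command above in a loop. Below is a minimal sketch using Python's subprocess module; the script path and parameter values are copied from the example command and should be adjusted to your setup.

```python
# Minimal sketch: repeat the example command over several seeds.
# Script path and hyperparameters are taken from the example above.
import subprocess

SEEDS = [11, 12, 13]

for seed in SEEDS:
    cmd = [
        "python", "examples_PolicyGradient/run_PG.py",
        "-seed", str(seed),
        "-batch_size", "5000",
        "-gamma_0", "0.1",
        "-method", "sgd",
        "-env", "cartpole",
        "-epochs", "100",
    ]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)  # raises if the experiment exits with an error
```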