rusenburn/Axel

Axel

Introduction

Axel includes implementations of modern machine learning algorithms that can learn to play gym-like environments.

Requirements

  • Python 3.10.8 environment
  • git

Getting started

Content

Proximal Policy Optimization (PPO)

PPO has some of the benefits of trust region policy optimization (TRPO), but it is much simpler to implement, more general, and has better sample complexity.
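The simplicity mentioned above comes largely from PPO's clipped surrogate objective, which replaces TRPO's constrained optimization with a per-sample clip on the probability ratio. The following is a minimal sketch in plain Python, not Axel's actual code: the function name `ppo_clip_objective`, its parameters, and the `clip_eps=0.2` default are assumptions chosen for illustration.

```python
import math


def ppo_clip_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective (a quantity to maximize).

    All names here are hypothetical; inputs are equal-length lists of floats,
    one entry per sampled (state, action) pair.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        # Probability ratio pi_new(a|s) / pi_old(a|s), computed from log-probs.
        ratio = math.exp(new_lp - old_lp)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Take the more pessimistic of the clipped and unclipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

With a positive advantage, the objective stops rewarding the policy once the ratio exceeds `1 + clip_eps`, which is what discourages overly large policy updates.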

Policy Optimization with Penalized Point Probability Distance (POP3D)

POP3D, which is a lower bound to the square of the total variation divergence, is proposed as another powerful variant of TRPO. Simulation results show that POP3D is highly competitive with PPO.
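As a rough illustration of the idea, POP3D replaces PPO's clipping with a penalty on the squared difference between the old and new probabilities of the sampled action. The sketch below is an assumption-laden illustration rather than Axel's implementation: `pop3d_objective`, its parameters, and the penalty coefficient value `beta=5.0` are hypothetical names and settings.

```python
def pop3d_objective(new_probs, old_probs, advantages, beta=5.0):
    """Mean surrogate objective with a point probability distance penalty.

    Hypothetical helper: inputs are equal-length lists holding, per sample,
    the new and old probabilities of the action that was actually taken.
    """
    total = 0.0
    for p_new, p_old, adv in zip(new_probs, old_probs, advantages):
        # Importance ratio between the new and old policy.
        ratio = p_new / p_old
        # Point probability distance: squared gap on the sampled action only.
        distance = (p_old - p_new) ** 2
        total += ratio * adv - beta * distance
    return total / len(advantages)
```

When the new policy equals the old one, the penalty vanishes and the objective reduces to the mean advantage.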

Phasic Policy Gradient (PPG)

PPG is a reinforcement learning framework that modifies traditional on-policy actor-critic methods by separating policy and value function training into distinct phases. PPG significantly improves sample efficiency compared to PPO.
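The phase separation can be sketched as an alternating schedule plus a joint loss for the auxiliary phase, where value information is distilled into the policy network while a KL term keeps the policy close to its previous behavior. This is a simplified sketch for a single state under stated assumptions, not Axel's code: `ppg_schedule`, `ppg_joint_loss`, `kl_divergence`, and `beta_clone` are illustrative names.

```python
import math


def ppg_schedule(num_phases, n_policy_iters, n_aux_epochs):
    """Return the order of training steps: policy phase, then auxiliary phase."""
    steps = []
    for _ in range(num_phases):
        steps.extend(["policy"] * n_policy_iters)
        steps.extend(["auxiliary"] * n_aux_epochs)
    return steps


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def ppg_joint_loss(aux_value_pred, value_target, old_policy, new_policy,
                   beta_clone=1.0):
    """Auxiliary-phase loss for one state (hypothetical helper).

    Squared error of the auxiliary value prediction plus a behavior-cloning
    penalty that discourages the policy from drifting during distillation.
    """
    aux_loss = 0.5 * (aux_value_pred - value_target) ** 2
    return aux_loss + beta_clone * kl_divergence(old_policy, new_policy)
```

For example, `ppg_schedule(2, 2, 1)` yields two policy steps followed by one auxiliary step, repeated twice.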

TODO

  • MuZero
  • Deep Q-learning algorithms

Known Issues

Because I work on this project alone with limited resources, it has only been tested on my own GPU; it has not been tested on older or high-end GPUs. Running on CPU when no GPU is available is not supported at the moment, but I plan to add it.

About

Implementations of modern machine-learning papers, including PPO, PPG, and POP3D
