# Ensure that the Python version is 3.10
pip install --editable ./third_party/torchquantum
pip install quarkstudio==7.0.5
pip install gymnasium[box2d]==0.29.1

We offer a user-friendly Python script and accompanying configuration files to facilitate training hybrid quantum-classical models in diverse reinforcement learning environments.

python main.py <config_file_name>

- Replace `<config_file_name>` with the desired environment configuration from the `./config` directory, or create a custom configuration of your own.

| Parameter | Description | Example Value |
|---|---|---|
| env_name | Name of the reinforcement learning environment. | LunarLander-v2 |
| n_steps | Number of steps per environment per update. | 1024 |
| mini_batch_size | Size of the mini-batch. | 64 |
| max_train_steps | Maximum number of training steps. | 1,750,000 |
| lr_a | Learning rate for the actor network. | 0.003 |
| lr_c | Learning rate for the critic network. | 0.0003 |
| gamma | Discount factor. | 0.999 |
| lamda | GAE parameter. | 0.98 |
| epsilon | PPO clip parameter. | 0.2 |
| K_epochs | Number of PPO epochs. | 4 |
| entropy_coef | Entropy coefficient. | 0.01 |
| num_envs | Number of environments to run in parallel. | 16 |
| n_blocks | Number of blocks in the quantum reinforcement learning network. | 1 |
| n_wires | Number of qubits in the quantum circuit. | 4 |
| use_quafu | Whether to use Quafu quantum hardware. | True |
| key | Token required for accessing Quafu cloud quantum hardware. | ' ' |
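
To see how the batching parameters relate to each other, here is a minimal sketch that collects the example values from the table in a dataclass and derives a few quantities from them. The `PPOConfig` class and its properties are hypothetical and not part of this repository. The sketch assumes the common PPO convention that each update collects `n_steps` transitions from every environment, and that `max_train_steps` counts environment steps summed over all environments; check `main.py` to confirm how these values are actually used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PPOConfig:
    """Hypothetical container; defaults are the example values from the table."""

    env_name: str = "LunarLander-v2"
    n_steps: int = 1024
    mini_batch_size: int = 64
    max_train_steps: int = 1_750_000
    lr_a: float = 0.003
    lr_c: float = 0.0003
    gamma: float = 0.999
    lamda: float = 0.98
    epsilon: float = 0.2
    K_epochs: int = 4
    entropy_coef: float = 0.01
    num_envs: int = 16
    n_blocks: int = 1
    n_wires: int = 4
    use_quafu: bool = True
    key: str = " "

    @property
    def rollout_size(self) -> int:
        # Assumption: one update uses n_steps transitions from each environment.
        return self.n_steps * self.num_envs

    @property
    def mini_batches_per_epoch(self) -> int:
        # Number of mini-batches needed to cover one rollout once.
        return self.rollout_size // self.mini_batch_size

    @property
    def gradient_steps_per_update(self) -> int:
        # Each rollout is reused for K_epochs passes.
        return self.K_epochs * self.mini_batches_per_epoch

    @property
    def num_updates(self) -> int:
        # Assumption: max_train_steps is counted over all environments together.
        return self.max_train_steps // self.rollout_size


cfg = PPOConfig()
print("transitions per update:   ", cfg.rollout_size)
print("mini-batches per epoch:   ", cfg.mini_batches_per_epoch)
print("gradient steps per update:", cfg.gradient_steps_per_update)
print("total updates:            ", cfg.num_updates)
```

Under these assumptions, the example values give 16,384 transitions per update, split into 256 mini-batches per epoch, for roughly 106 updates over the full training run.
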
Training results can be visualized using TensorBoard:
tensorboard --logdir=./runs

Benchmark reinforcement learning environments have been successfully solved using PPO-Q, as illustrated in the following table and figures.

| Environment | State Space Dimension | Action Space Dimension |
|---|---|---|
| CartPole | 4 | 2 |
| MountainCar | 2 | 3 |
| Acrobot | 6 | 3 |
| LunarLander | 8 | 4 |
| MountainCar(C) | 2 | 1 |
| Pendulum | 3 | 1 |
| LunarLander(C) | 8 | 2 |
| BipedalWalker | 24 | 4 |

| CartPole | Acrobot | LunarLander |
|---|---|---|
| ![]() | ![]() | ![]() |

| MountainCar(C) | Pendulum | BipedalWalker |
|---|---|---|
| ![]() | ![]() | ![]() |

An arXiv preprint is coming soon!





