This repo contains code for training various soft manipulator policies using both Elastica and DisMech simulators. A preprint containing the experiments and results can be found here.
To run the code in this repo, two dependencies are needed:
- Agent Learning Framework (alf) for training RL policies.
- DisMech for simulating soft-body dynamics.
- Elastica (optional) for simulating soft-body dynamics.
To install alf, clone the repo and pip install it as follows. It's recommended to use a venv.
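For example, a venv can be created and activated first (the path below is just illustrative):
python3 -m venv ~/venvs/soft_manip
source ~/venvs/soft_manip/bin/activate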
git clone git@github.com:HorizonRobotics/alf.git
cd alf
pip install -e .
pip install pyvista # for visualizing the 3D obstacle task

Instructions for installing DisMech can be found in its repo.
Users can also install Elastica if they wish to use it in place of DisMech.
pip install pyelastica==0.2.4

We also provide a Dockerfile with all necessary dependencies; it installs the alf, py_dismech, and pyelastica packages when built.
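As a rough sketch (the image tag is arbitrary and this assumes the Dockerfile sits at the repo root), the image can be built and run with:
docker build -t soft_manip .
docker run -it --rm soft_manip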
There are four different soft manipulator tasks that can be trained by specifying the appropriate conf file in confs/.
- Following a target
- Inverse Kinematics 4D (x, y, z, yaw)
- 2D Reach through obstacles
- 3D Reach through obstacles
The simulation framework can be chosen by specifying --conf_param "sim_framework='{dismech or elastica}'".
Below is an example for training the first task with DisMech.
export OMP_NUM_THREADS=1
python -m alf.bin.train \
--conf confs/follow_conf.py \
--root_dir /path/to/root_dir \
--conf_param "sim_framework='dismech'"Training results can be monitored via tensorboard.
cd /path/to/root_dir
tensorboard --logdir .

To save a policy, cancel training with Ctrl+C and press Y to write out a checkpoint.
By default, all tasks are trained using SAC with 500 parallel environments. Users interested in using other algorithms can take a look at
confs/rl_alg_constructor.py. Note that switching algorithms may require hyperparameter tuning.
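As a sketch of the mechanism, hyperparameters can be overridden with the same --conf_param flag used elsewhere in this README; the parameter name SacAlgorithm.target_update_tau below is illustrative and should be verified against rl_alg_constructor.py and the alf source.
python -m alf.bin.train \
--conf confs/follow_conf.py \
--root_dir /path/to/root_dir \
--conf_param "sim_framework='dismech'" \
--conf_param SacAlgorithm.target_update_tau=0.005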
Furthermore, all tasks can still be trained successfully with fewer environments in case 500 parallel environments cause too much contention.
To change the number of environments, add the following flag to the above train command
--conf_param create_environment.num_environments=N

For desktop CPUs, we found 100 environments to be a good compromise between experience diversity and computational cost.
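For example, the follow-target training command from above, run with 100 parallel environments:
export OMP_NUM_THREADS=1
python -m alf.bin.train \
--conf confs/follow_conf.py \
--root_dir /path/to/root_dir \
--conf_param "sim_framework='dismech'" \
--conf_param create_environment.num_environments=100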
After training a policy, you can run the latest checkpoint and observe a visualization via
python -m alf.bin.play \
--root_dir /path/to/root_dir \
--conf_param _USER.render=True

Users can also compare the sim-to-sim gap between DisMech and Elastica by evaluating a policy trained in one simulator in the other.
# the conf must be specified explicitly when switching simulators
python -m alf.bin.play \
--conf confs/follow_conf.py \
--root_dir /path/to/dismech_policy \
--conf_param _USER.render=True \
--conf_param "sim_framework='elastica'"

If our work has helped your research, please cite the following preprint.
@misc{choi2025rapidlylearningsoftrobot,
title={Rapidly Learning Soft Robot Control via Implicit Time-Stepping},
author={Andrew Choi and Dezhong Tong},
year={2025},
eprint={2511.06667},
archivePrefix={arXiv},
primaryClass={cs.RO},
url={https://arxiv.org/abs/2511.06667},
}



