The official repository for the CodeGym project: "Generalizable End-to-End Tool-Use RL with Synthetic CodeGym"

StigLidu/CodeGym

CodeGym Logo

Generalizable End-to-End Tool-Use RL with Synthetic CodeGym

License: CC BY-NC 4.0 arXiv Hugging Face Dataset

Weihua Du, Hailei Gong, Zhan Ling, Kang Liu, Lingfeng Shen, Xuesong Yao, Yufei Xu, Dingyuan Shi, Yiming Yang, Jiecao Chen
"Generalizable End-to-End Tool-Use RL with Synthetic CodeGym" (2025)

CodeGym is a synthetic environment generation framework for LLM agent reinforcement learning on multi-turn tool-use tasks. It automatically converts static code problems into interactive CodeGym environments where agents can learn to use tools to solve complex tasks in various configurations.

Key Components

We are open-sourcing the following key parts of the project:

  • CodeGym environment synthesis pipeline: refer to gym/README.md for details.
  • Server for launching CodeGym environments aimed at large-scale reinforcement learning: refer to online_server/README.md for details.

A community reproduction of the synthetic dataset is available on Hugging Face.

Overview

CodeGym Logo

CodeGym transforms traditional code problems into interactive environments where LLM agents can learn to:

  • Use tools and actions to solve problems step-by-step
  • Develop tool-use behaviors that generalize beyond the training environments

Environment Synthesis Process

CodeGym Logo

We designed a two-stage process for synthesizing and verifying CodeGym environments:

Gym Synthesis:

  • Extract reusable code logic and functions from programming solutions
  • Convert them into a library of documented tools and utilities
  • Generate OpenAI Gym format environments with state, actions, transitions, and rewards
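
To make the target format concrete, here is a minimal sketch of what a Gym-style environment derived from a code problem might look like. It is illustrative only: the class name `MaxSubarrayEnv`, the action names, and the reward scheme are assumptions made for this example and are not taken from the CodeGym codebase. It uses plain Python rather than the `gym` package so that it runs without dependencies.

```python
from typing import Any, Dict, List, Tuple


class MaxSubarrayEnv:
    """Hypothetical environment wrapping a 'maximum subarray sum' problem.

    The original solution's logic is split into small tools that the agent
    calls step by step instead of writing the full program itself.
    """

    def __init__(self, nums: List[int]) -> None:
        self.nums = nums
        self.reset()

    def reset(self) -> Dict[str, Any]:
        # State: the hidden array plus a flag marking the end of the episode.
        self.done = False
        return {"length": len(self.nums)}

    def step(self, action: Dict[str, Any]) -> Tuple[Any, float, bool]:
        """Apply one tool call and return (observation, reward, done)."""
        if self.done:
            raise RuntimeError("episode finished; call reset()")
        name = action["name"]
        if name == "get_element":
            # Tool: read one array element by index.
            return self.nums[action["index"]], 0.0, False
        if name == "max_of":
            # Tool: return the larger of two numbers.
            return max(action["a"], action["b"]), 0.0, False
        if name == "submit":
            # Terminal action: reward 1.0 only for the correct answer.
            self.done = True
            reward = 1.0 if action["answer"] == self._reference() else 0.0
            return None, reward, True
        return f"unknown action: {name}", 0.0, False

    def _reference(self) -> int:
        # Ground truth computed with Kadane's algorithm.
        best = cur = self.nums[0]
        for x in self.nums[1:]:
            cur = max(x, cur + x)
            best = max(best, cur)
        return best
```

An agent would interact with such an environment over multiple turns, for example `env.step({"name": "get_element", "index": 0})` followed eventually by `env.step({"name": "submit", "answer": ...})`.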

Gym Verification:

  • Generate comprehensive unit tests spanning multiple difficulty levels
  • Validate environment correctness (no compilation errors, timeouts, or memory issues)
  • Verify solvability by generating solution functions that successfully use the provided tools
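
The verification idea can be sketched as follows, under simplifying assumptions. `SumEnv`, `tool_only_solver`, and `verify` are hypothetical names invented for this example, and a step budget stands in for the real timeout and memory checks; the actual pipeline is described in gym/README.md and may differ substantially.

```python
from typing import Any, Callable, Dict, List, Tuple

Action = Dict[str, Any]
StepResult = Tuple[Any, float, bool]


class SumEnv:
    """Toy stand-in for a synthesized environment: the task is to sum a list."""

    def __init__(self, nums: List[int]) -> None:
        self.nums = nums

    def step(self, action: Action) -> StepResult:
        if action["name"] == "get_length":
            return len(self.nums), 0.0, False
        if action["name"] == "get_element":
            return self.nums[action["index"]], 0.0, False
        if action["name"] == "submit":
            return None, float(action["answer"] == sum(self.nums)), True
        raise ValueError(f"unknown action: {action['name']}")


def tool_only_solver(step: Callable[[Action], StepResult]) -> float:
    """Reference solution that touches the task only through tool calls."""
    n, _, _ = step({"name": "get_length"})
    total = 0
    for i in range(n):
        value, _, _ = step({"name": "get_element", "index": i})
        total += value
    _, reward, _ = step({"name": "submit", "answer": total})
    return reward


def verify(cases: List[List[int]], max_steps: int = 1000) -> Dict[str, bool]:
    """Check that every test case runs without errors and is solvable."""
    report = {"no_errors": True, "solvable": True}
    for nums in cases:
        env = SumEnv(nums)
        calls = 0

        def budgeted_step(action: Action) -> StepResult:
            # Enforce a step budget as a crude proxy for a timeout.
            nonlocal calls
            calls += 1
            if calls > max_steps:
                raise TimeoutError("step budget exceeded")
            return env.step(action)

        try:
            reward = tool_only_solver(budgeted_step)
        except Exception:
            # Any crash or budget overrun fails both checks for this case.
            report["no_errors"] = False
            report["solvable"] = False
            continue
        if reward != 1.0:
            report["solvable"] = False
    return report
```

Running `verify` over test cases of increasing size mirrors the "multiple difficulty levels" idea: an environment is kept only if the tool-only solution earns full reward on every case within the budget.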

Examples

The example/ folder contains sample CodeGym environments to help you get started:

  • example/example_envs contains sample CodeGym environments
  • example/training_instance.jsonl contains sample instances for RL training
  • example/raw_problems.jsonl contains sample raw coding problems for demonstrating the generation pipeline
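
A quick way to inspect the `.jsonl` files is to load them and list the fields they contain. The helpers below are a generic sketch: `load_jsonl` and `summarize_keys` are names made up for this example, and the code makes no assumption about the schema beyond each line being a JSON object.

```python
import json
from pathlib import Path
from typing import Any, Dict, List


def load_jsonl(path: str) -> List[Dict[str, Any]]:
    """Read a JSON Lines file: one JSON object per non-empty line."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


def summarize_keys(records: List[Dict[str, Any]]) -> List[str]:
    """Return the sorted set of top-level keys seen across all records."""
    return sorted({key for rec in records for key in rec})
```

For example, `summarize_keys(load_jsonl("example/training_instance.jsonl"))` lists the top-level fields of the training instances when run from the repository root.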

Key Result

By training in CodeGym, LLMs show stronger generalization on out-of-distribution (OOD) tool-use and multi-turn benchmarks:

CodeGym Logo

CodeGym Synthesis Pipeline

We release the pipeline for environment synthesis and verification. Please refer to gym/README.md for details.

Server for CodeGym Environments

We release a highly concurrent server for launching CodeGym environments aimed at large-scale reinforcement learning. Please refer to online_server/README.md for details.

License

This project and dataset are released under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0) license.

Citation

If you find this work useful, please cite our paper:

@article{du2025generalizable,
  title={Generalizable End-to-End Tool-Use RL with Synthetic CodeGym},
  author={Du, Weihua and Gong, Hailei and Ling, Zhan and Liu, Kang and Shen, Lingfeng and Yao, Xuesong and Xu, Yufei and Shi, Dingyuan and Yang, Yiming and Chen, Jiecao},
  journal={arXiv preprint arXiv:2509.17325},
  year={2025}
}
