There seem to be many similarities between AgentRL and slime: both expose a single API for rollout generation, integrate deeply with SGLang, and support asynchronous RL training. The key differences appear to be AgentRL's addition of cross-policy sampling and task advantage normalization.
Is AgentRL the future of RL post-training at Z.ai, and does this project replace slime? If starting a new multi-turn agentic RL training project, should one build on top of AgentRL or slime?
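For context on one of the differences mentioned above: "task advantage normalization" typically means normalizing advantages within each task group rather than across the whole batch, so that tasks with very different reward scales contribute comparably to the policy gradient. The sketch below is a generic illustration of that idea, not AgentRL's actual implementation; the function name and example task labels are hypothetical.

```python
import numpy as np

def normalize_advantages_per_task(advantages, task_ids, eps=1e-8):
    """Hypothetical sketch: standardize advantages within each task group.

    Mixing tasks whose rewards live on different scales lets one task
    dominate the gradient; per-task normalization equalizes their influence.
    """
    advantages = np.asarray(advantages, dtype=np.float64)
    task_ids = np.asarray(task_ids)
    out = np.empty_like(advantages)
    for task in np.unique(task_ids):
        mask = task_ids == task
        group = advantages[mask]
        # Center and scale within the task; eps guards against zero variance.
        out[mask] = (group - group.mean()) / (group.std() + eps)
    return out

# Example: two tasks with very different reward scales end up comparable.
adv = [10.0, 12.0, 0.1, 0.3]
tasks = ["web", "web", "code", "code"]
print(normalize_advantages_per_task(adv, tasks))  # ≈ [-1.  1. -1.  1.]
```

Whether AgentRL applies this per rollout batch, per task across batches, or with a different statistic is a detail best confirmed in its source.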