There seem to be many similarities between AgentRL and slime: both expose a single API for rollout generation, integrate deeply with SGLang, and support asynchronous RL training. The key differences appear to be AgentRL's addition of cross-policy sampling and task advantage normalization.
Is AgentRL the future of RL post-training at Z.ai, and does this project replace slime? If starting a new multi-turn agentic RL training project, should one build on top of AgentRL or slime?
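For context on one of the differences mentioned above: "task advantage normalization" typically means normalizing advantages within each task group rather than across the whole batch, so that tasks with very different reward scales contribute comparably to the policy gradient. The sketch below is a generic illustration of that idea, not AgentRL's actual implementation; the function name and example task labels are hypothetical.

```python
import numpy as np

def normalize_advantages_per_task(advantages, task_ids, eps=1e-8):
    """Hypothetical sketch: standardize advantages within each task group.

    Mixing tasks whose rewards live on different scales lets one task
    dominate the gradient; per-task normalization equalizes their influence.
    """
    advantages = np.asarray(advantages, dtype=np.float64)
    task_ids = np.asarray(task_ids)
    out = np.empty_like(advantages)
    for task in np.unique(task_ids):
        mask = task_ids == task
        group = advantages[mask]
        # Center and scale within the task; eps guards against zero variance.
        out[mask] = (group - group.mean()) / (group.std() + eps)
    return out

# Example: two tasks with very different reward scales end up comparable.
adv = [10.0, 12.0, 0.1, 0.3]
tasks = ["web", "web", "code", "code"]
print(normalize_advantages_per_task(adv, tasks))  # ≈ [-1.  1. -1.  1.]
```

Whether AgentRL applies this per rollout batch, per task across batches, or with a different statistic is a detail best confirmed in its source.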