
Releases: alibaba/ROLL

v0.2.0 release

04 Feb 09:04


Hello everyone! Thank you for your interest in ROLL.
ROLL has recently been updated with a large number of new features. Below is a summary of the recent updates; we will continue to iterate on ROLL. You are welcome to join the ROLL community.

🚀 Highlights:

  • New model support: Qwen3-VL, Qwen3-MoE-VL, Qwen3-Omni, GLM-4.7
  • Partial GPU overlap between agentic training and Rollout: idle training GPUs are switched over to Rollout
  • DynamicSamplingScheduler coroutine refactoring
  • New: FSDP2 Strategy
  • Training supports sequence packing and dynamic batching (see the packing sketch after this list)
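
As background for the sequence packing / dynamic batching highlight, the packing sketch below shows the general technique of packing variable-length samples into batches under a token budget. It is framework-agnostic: the sample format and the max_tokens_per_batch knob are illustrative assumptions, not ROLL's actual API.

```python
# Minimal sketch of dynamic batching via sequence packing (illustrative, not ROLL's API).
# Samples are dicts carrying an "input_ids" list; max_tokens_per_batch is an assumed knob.
from typing import Dict, Iterable, List


def pack_into_batches(
    samples: Iterable[Dict[str, List[int]]],
    max_tokens_per_batch: int = 8192,
) -> List[List[Dict[str, List[int]]]]:
    """Greedily pack variable-length samples so each batch stays under a token budget."""
    # Sort longest-first so long sequences do not get stranded in tiny batches.
    ordered = sorted(samples, key=lambda s: len(s["input_ids"]), reverse=True)
    batches: List[List[Dict[str, List[int]]]] = []
    current: List[Dict[str, List[int]]] = []
    current_tokens = 0
    for sample in ordered:
        n = len(sample["input_ids"])
        if current and current_tokens + n > max_tokens_per_batch:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(sample)
        current_tokens += n
    if current:
        batches.append(current)
    return batches
```

Batches built this way keep padding low and the token count per step roughly constant, which is what makes dynamic batching effective.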

🚀 Major New Features:

  • Rollout
    • DynamicSamplingScheduler coroutine refactoring
    • Custom rollout pre/post process, supporting dynamic sampling params, multi-stage generation, ThinkingBudget control
    • Sglang: Strategy refactoring, supporting server mode, native onload/offload, inflight FP8 quant rollout, cross-machine multi-node deployment
    • vLLM: DP/EP support, supports vllm==0.12.0
    • Provides an AgentNative Rollout paradigm (AgentNativeStepEnvManager + SokobanNativeEnv), with the context fully managed by the env
    • Async Rollout Hang Detect: added asynchronous Rollout hang detection to quickly locate problematic envs (see the hang-detect sketch after this list)
    • Supports rollout dump & mock, improving the efficiency of precision alignment between the forward and train phases
    • Agentic pipeline supports train-val/rollout overlap
  • Training
    • Model update implementation optimizations: eliminate inter-machine redundancy, overlap weight conversion with NCCL broadcast, optimize host-to-device transfer, and change synchronization across multiple PP stages from serial to a lock-based mode so they synchronize simultaneously
  • Asynchronous Feature
    • Partial GPU overlap between training and Rollout: idle training GPUs are switched over to Rollout, report: https://arxiv.org/abs/2512.24873
    • Agentic off-policy loss with importance-sampling (IS) correction (see the loss sketch after this list)
  • Pipeline recipe
    • VLM image tool use (DeepEyes): tool invocation overlapped with reward calculation
  • Models: New model support for Qwen3-VL, Qwen3-MoE-VL, Qwen3-Omni-Thinker, GLM-4.7
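
As a rough illustration of the Async Rollout Hang Detect item above, the hang-detect sketch below wraps each env's rollout coroutine in a timeout so the offending env_id gets logged. It is a minimal sketch using plain asyncio; env_rollout and the 600-second timeout are hypothetical placeholders, not ROLL's actual interface.

```python
# Illustrative sketch of async rollout hang detection via a per-env timeout;
# env_rollout() and timeout_s are hypothetical, not ROLL's actual API.
import asyncio
import logging

logger = logging.getLogger("rollout_hang_detect")


async def rollout_with_hang_detect(env_id: int, env_rollout, timeout_s: float = 600.0):
    """Run one env's rollout coroutine and log which env hangs past the timeout."""
    try:
        return await asyncio.wait_for(env_rollout(env_id), timeout=timeout_s)
    except asyncio.TimeoutError:
        logger.error("rollout hang detected: env_id=%d exceeded %.0fs", env_id, timeout_s)
        return None
```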
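
The off-policy loss with IS correction noted above reweights the policy-gradient term by the ratio between current-policy and rollout (behavior) log-probs. Below is a minimal PPO-style clipped sketch of that idea; the tensor names and clip_eps value are assumptions, not ROLL's actual loss code.

```python
# Illustrative sketch of an off-policy policy-gradient loss with importance-sampling
# correction (PPO-style clipped ratio); names and clip_eps are assumptions.
import torch


def is_corrected_pg_loss(
    logprobs: torch.Tensor,       # log pi_theta(a_t | s_t), shape [B, T]
    old_logprobs: torch.Tensor,   # log pi_behavior(a_t | s_t) from rollout, shape [B, T]
    advantages: torch.Tensor,     # advantage estimates, shape [B, T]
    response_mask: torch.Tensor,  # 1 for valid response tokens, shape [B, T]
    clip_eps: float = 0.2,
) -> torch.Tensor:
    ratio = torch.exp(logprobs - old_logprobs)                 # importance-sampling ratio
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    per_token = -torch.min(ratio * advantages, clipped * advantages)
    mask = response_mask.float()
    return (per_token * mask).sum() / mask.sum().clamp(min=1.0)
```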

release flag for v0.1.3

08 Dec 08:37


🚀 Highlights:

  • (feat): support Qwen3VL, mcore_adapter and examples.
  • (feat): Add optimization for computing ref_logprobs and old_logprobs.
  • (feat): support vllm beam_search.
  • (feat): Add support for Qwen-3-next on AMD GPUs.
  • (feat): support sglang==0.5.4, vllm==0.11.1, torch==2.8.0.

🚀 Major New Features:

  • Agentic
    • (fix): fix agentic val get_batch state in redundancy env.
    • (feat): agentic-spec actor worker.
    • (feat): add infer_log_probs in agentic.
    • (feat): refactor agentic norm like LitePPO.
    • (feat): add agentic profile metrics.
  • Models & Backends
    • (feat): support vllm beam_search.
    • (feat): Add support for Qwen-3-next on AMD GPUs.
    • (feat): support offload nccl to save gpu memory. Thanks to slime.
    • (feat): support sglang 0.5.4.
    • (feat): sglang support dp-attention.
    • (feat): add enable_reference option. #250
    • (feat): add enable_old_logprobs, opt old log probs by cache.
    • (feat): support Qwen3VL, mcore_adapter and examples yaml. #190
    • (feat): add sequence packing for sft pipeline and distill pipeline, optimize memory usage during top-k logits computation (see the sketch after this list).
  • Bug fixes & refactoring
    • (fix): update math rule reward worker with thinking. #281
    • (feat): set RAY_CGRAPH_get_timeout=600.
    • (fix): fix train infer ratio/diff mean & add train infer ratio/diff token/seq mask & add rollout importance sampling. #242 #273
    • (fix): ensure compatibility with transformers version check for causal mask update.
    • (fix): fix vllm 0.11.0 import for torch 2.8.0.
    • (fix): fix tokenizer mismatch between policy and reward model in llm judge reward worker. #91
    • (fix): fix bugs in data fetching for face embeddings for wan_module.
    • (fix): vllm _generate_standard missing prompt_token_ids input args in vllm >0.11.0. #189
    • (fix): vllm add missing argument is_lora in function update_parameter. #233
    • (fix): fix bugs with metrics recording in the DPO pipeline.
    • (fix): update image loading logic for byte data in rlvr_vlm_pipeline.py.
    • (fix): add alive check. #253
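
Regarding the top-k logits memory optimization mentioned above, one common way to cut peak memory is to compute logits chunk by chunk so the full [num_tokens, vocab] matrix is never materialized. The sketch below illustrates that idea only; the function and argument names are assumptions, not ROLL's implementation.

```python
# Illustrative sketch of chunked top-k logits computation to reduce peak memory
# during distillation; chunk_size and names are assumptions, not ROLL's code.
import torch


def topk_logits_chunked(
    hidden_states: torch.Tensor,   # [num_tokens, hidden_dim]
    lm_head_weight: torch.Tensor,  # [vocab_size, hidden_dim]
    k: int = 64,
    chunk_size: int = 1024,
):
    """Compute per-token top-k logits without materializing the full [num_tokens, vocab] matrix."""
    top_vals, top_idx = [], []
    for chunk in torch.split(hidden_states, chunk_size, dim=0):
        logits = chunk @ lm_head_weight.T          # [chunk, vocab] exists only for this chunk
        vals, idx = torch.topk(logits, k, dim=-1)
        top_vals.append(vals)
        top_idx.append(idx)
    return torch.cat(top_vals, dim=0), torch.cat(top_idx, dim=0)
```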