[RL] Fix rl on main #2427

Open
Lucaskabela wants to merge 1 commit into pytorch:main from Lucaskabela:lucaskabela/fix_main_rl_experiments

@Lucaskabela
Contributor

Following the config changes in #2386, the RL scripts fail on main.

This PR modifies the code to restore them.

Test Plan

Simple RL

VLLM_BATCH_INVARIANT=1 VLLM_ATTENTION_BACKEND=FLASH_ATTN python3 torchtitan/experiments/rl/vllm_compat/simple_rl.py

output:

Step   0 | Loss: -0.0036 | Reward: +0.075 | Samples: 160
  Sample:  48 + 24 = 72.
Let me check the reasoning.
Okay, so Natalia sold clips to 48 of ...
...

Simple Multiprocess RL

VLLM_BATCH_INVARIANT=1 VLLM_ATTENTION_BACKEND=FLASH_ATTN python3 torchtitan/experiments/rl/unified/simple_rl_multiprocess.py

output:

[2026-02-23 15:03:25] INFO trainer.py:114: [actor=<root>.<torchtitan.experiments.rl.unified.actors.trainer.Trainer trainer{'gpus': 0/2}>] os.getpid()=3548191 Trainer starts to train 1 on traj:
  ✓ vLLM-TorchTitan bitwise determinism verified: 20 tokens match exactly
  ✓ vLLM-TorchTitan bitwise determinism verified: 20 tokens match exactly
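
For readers unfamiliar with what a "bitwise" match implies, here is a minimal sketch of such a comparison. It is not the check used in torchtitan, which is not shown in this thread. The function names `float_bits` and `first_mismatch` are hypothetical, and the assumption that per-token log-probabilities are what gets compared is mine. The point is only that a bitwise check compares raw bit patterns rather than using a numeric tolerance.

```python
import struct
from typing import Optional, Sequence


def float_bits(x: float) -> bytes:
    # Pack as an IEEE-754 double so we compare the raw bit pattern,
    # not the numeric value (0.0 and -0.0 are equal numerically but not bitwise).
    return struct.pack("<d", x)


def first_mismatch(ref: Sequence[float], other: Sequence[float]) -> Optional[int]:
    """Return the index of the first position that differs bitwise, or None.

    Hypothetical helper for illustration; `ref` and `other` stand in for
    per-token values produced by two different backends.
    """
    for i, (r, o) in enumerate(zip(ref, other)):
        if float_bits(r) != float_bits(o):
            return i
    # Sequences of different length mismatch at the end of the shorter one.
    if len(ref) != len(other):
        return min(len(ref), len(other))
    return None


if __name__ == "__main__":
    a = [-0.5, -1.25, -3.0]
    b = [-0.5, -1.25, -3.0]
    idx = first_mismatch(a, b)
    if idx is None:
        print(f"{len(a)} tokens match exactly")
    else:
        print(f"first mismatch at token {idx}")
```

A tolerance-based comparison (for example `abs(r - o) < 1e-6`) would accept small numerical drift, which is exactly what a bitwise check is meant to rule out.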

@meta-cla meta-cla bot added the CLA Signed label Feb 23, 2026
Contributor

@wwwjn wwwjn left a comment

I updated this PR to use the new Config System; you can use it as a reference for the RoPE part: #2191

But this PR contains more: it redesigns the RL-side config system, so it might go through several rounds of review. If you want to land your fix first, I'm also OK with that.

@Lucaskabela Lucaskabela force-pushed the lucaskabela/fix_main_rl_experiments branch from 4ac7fd8 to 53df656 February 24, 2026 00:07
@Lucaskabela Lucaskabela requested a review from wwwjn February 24, 2026 00:07
@Lucaskabela Lucaskabela force-pushed the lucaskabela/fix_main_rl_experiments branch from 53df656 to ac6c2b8 February 24, 2026 16:09
@Lucaskabela Lucaskabela requested a review from wwwjn February 24, 2026 16:11

Labels

ciflow/8gpu, CLA Signed

2 participants