-
Notifications
You must be signed in to change notification settings - Fork 3k
[rollout] fix: vllm-rollout update weights #4780
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request adds a call to process_weights_after_loading after updating weights for non-FP8 models. This appears to be an important bugfix. However, the same logic is missing for the FP8 model code path, which could lead to incorrect model behavior. I've added a critical comment to highlight this and recommend applying the fix to the FP8 path as well to ensure correctness for all model types.
| vllm_config = model_runner.vllm_config | ||
| device = next(model.parameters()).device | ||
| from vllm.model_executor.model_loader.utils import process_weights_after_loading | ||
| process_weights_after_loading(model, vllm_config, device) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition of process_weights_after_loading seems to be a necessary bugfix for correctly updating model weights. However, this logic is only applied to the non-FP8 code path. The FP8 path, handled in the if is_fp8_model(...) block (lines 263-267), also performs weight loading but misses this crucial post-processing step.
This omission could lead to incorrect behavior or bugs for FP8 models after weights are updated.
Please ensure process_weights_after_loading is also called for the FP8 model path. A possible solution is to duplicate these lines inside the if block, or refactor to call it once after the if-else statement.
What does this PR do?
Fix the bug, some model need to do some work after update weights;
Example: Ascend DeepSeekV3 mla : after update weights should tranpose
Checklist Before Starting
[{modules}] {type}: {description}(This will be checked by the CI){modules}includefsdp,megatron,sglang,vllm,rollout,trainer,ci,training_utils,recipe,hardware,deployment,ray,worker,single_controller,misc,perf,model,algo,env,tool,ckpt,doc,data,cfg,reward,like[megatron, fsdp, doc]{type}is infeat,fix,refactor,chore,test[BREAKING]to the beginning of the title.[BREAKING][fsdp, megatron] feat: dynamic batchingTest
API and Usage Example
# Add code snippet or script demonstrating how to use thisDesign & Code Changes
Checklist Before Submitting
Important
Please check all the following items before requesting a review, otherwise the reviewer might deprioritize this PR for review.
pre-commit install && pre-commit run --all-files --show-diff-on-failure --color=alwaysci-requestchannel in theverlSlack workspace. (If not accessible, please try the Feishu group (飞书群).)