
Conversation


@bjf-frz commented Jan 12, 2026

What does this PR do?

This PR fixes a critical bug in the training path of NPUQwen3VLMoeTextExperts, where incorrect routing weights were used during the token unpermutation step.
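For context, here is a minimal pure-PyTorch sketch of the MoE token permute/unpermute pattern at issue (shapes, names, and values are illustrative; this is not the torch_npu kernel API):

import torch

num_tokens, hidden, num_experts, top_k = 4, 8, 16, 2

hidden_states = torch.randn(num_tokens, hidden)
routing_weights = torch.softmax(torch.randn(num_tokens, num_experts), dim=-1)  # (num_tokens, num_experts)
topk_probs, router_indices = torch.topk(routing_weights, top_k, dim=-1)       # (num_tokens, top_k) each

# Permute: replicate each token once per selected expert.
expert_out = hidden_states.unsqueeze(1).expand(-1, top_k, -1)  # stand-in for per-expert MLP outputs

# Unpermute: merge each token's top_k expert outputs, weighted by the
# top-k probabilities (the role played by `probs` in the fused kernel call).
next_states = (expert_out * topk_probs.unsqueeze(-1)).sum(dim=1)  # (num_tokens, hidden)

The bug concerns which probabilities end up in that weighting during training.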

Checklist Before Starting

  • Search for similar PRs. Paste at least one query link here: [fsdp] feat: update NPU fused kernels for Qwen3 moe block #4406
  • Format the PR title as [{modules}] {type}: {description} (This will be checked by the CI)
    • {modules} include fsdp, megatron, sglang, vllm, rollout, trainer, ci, training_utils, recipe, hardware, deployment, ray, worker, single_controller, misc, perf, model, algo, env, tool, ckpt, doc, data, cfg, reward
    • If this PR involves multiple modules, separate them with , like [megatron, fsdp, doc]
    • {type} is in feat, fix, refactor, chore, test
    • If this PR breaks any API (CLI arguments, config, function signature, etc.), add [BREAKING] to the beginning of the title.
    • Example: [BREAKING][fsdp, megatron] feat: dynamic batching

Test

With these modifications, the GPU and NPU results are numerically consistent: the reward trends align, and the remaining discrepancies fall within an acceptable margin of error.

[Image: results (GPU vs. NPU reward curves)]
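A rough sketch of the kind of consistency check this implies (hypothetical reward values and tolerances, not the actual test script):

import torch

# Hypothetical per-step rewards collected from each backend.
gpu_rewards = torch.tensor([0.12, 0.18, 0.25, 0.31])
npu_rewards = torch.tensor([0.12, 0.18, 0.24, 0.31])

# Elementwise agreement within a chosen tolerance.
assert torch.allclose(gpu_rewards, npu_rewards, rtol=5e-2, atol=1e-3)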

API and Usage Example

Demonstrate how the API changes if any, and provide usage example(s) if possible.

# Add code snippet or script demonstrating how to use this

Design & Code Changes

Demonstrate the high-level design if this PR is complex, and list the specific changes.

Checklist Before Submitting

Important

Please check all the following items before requesting a review; otherwise, the reviewer may deprioritize this PR.

@gemini-code-assist bot left a comment


Code Review

This pull request attempts to fix an issue in the NPU patch for Qwen3-VL MoE models. However, the change introduces a critical bug due to a tensor shape mismatch. The modification assumes routing_weights has the full expert dimension, but the calling function passes a tensor with only the top-k probabilities. This will cause an index-out-of-bounds error during execution. I've provided a comment with a detailed explanation and a suggestion to revert the change.

Comment on lines +188 to +193
num_tokens = hidden_states.shape[0]
top_k = router_indices.shape[1]
batch_idx = torch.arange(num_tokens, device=routing_weights.device)
batch_idx = batch_idx.unsqueeze(1).expand(-1, top_k)
selected_probs = routing_weights[batch_idx, router_indices]
next_states = torch_npu.npu_moe_token_unpermute(output, row_ids_map, probs=selected_probs)

critical

This change appears to introduce a bug due to a mismatch in tensor shapes between the caller and this function.

The calling function, NPUQwen3VLMoeTextSparseMoeBlock.forward, reassigns routing_weights to the top-k probability values at line 225:

# verl/models/transformers/npu_patch.py:225
routing_weights, router_indices = torch.topk(routing_weights, self.top_k, dim=-1)

This results in routing_weights having a shape of (num_tokens, top_k) when passed to this function during training.

However, the new code here attempts to index routing_weights using router_indices, which contains expert indices that can be larger than top_k - 1:

# verl/models/transformers/npu_patch.py:192
selected_probs = routing_weights[batch_idx, router_indices]

This will lead to an index out of bounds error.
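A minimal repro of the mismatch (illustrative sizes, not the actual model config):

import torch

num_tokens, num_experts, top_k = 4, 16, 2

routing_weights = torch.rand(num_tokens, num_experts)
# The caller overwrites routing_weights with the top-k values:
routing_weights, router_indices = torch.topk(routing_weights, top_k, dim=-1)
# routing_weights is now (4, 2), but router_indices holds expert ids in [0, 16).

batch_idx = torch.arange(num_tokens).unsqueeze(1).expand(-1, top_k)
selected_probs = routing_weights[batch_idx, router_indices]  # IndexError whenever an id >= 2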

The original implementation seems correct, as it directly uses the (num_tokens, top_k) routing_weights tensor, which appears to be the expected input for npu_moe_token_unpermute.

If the intent is for routing_weights to be the full (num_tokens, num_experts) probability distribution, the fix should be in NPUQwen3VLMoeTextSparseMoeBlock.forward to avoid overwriting routing_weights. Without that change in the caller, this PR is incorrect.
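A sketch of that caller-side alternative (variable names assumed from the quoted code; untested):

# In NPUQwen3VLMoeTextSparseMoeBlock.forward: keep the full distribution
# instead of overwriting it with the top-k values.
topk_probs, router_indices = torch.topk(routing_weights, self.top_k, dim=-1)
# Pass the full routing_weights (not topk_probs) to the experts module, so its
# gather routing_weights[batch_idx, router_indices] indexes the expert dimension.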

Suggested change
- num_tokens = hidden_states.shape[0]
- top_k = router_indices.shape[1]
- batch_idx = torch.arange(num_tokens, device=routing_weights.device)
- batch_idx = batch_idx.unsqueeze(1).expand(-1, top_k)
- selected_probs = routing_weights[batch_idx, router_indices]
- next_states = torch_npu.npu_moe_token_unpermute(output, row_ids_map, probs=selected_probs)
+ next_states = torch_npu.npu_moe_token_unpermute(output, row_ids_map, probs=routing_weights)

@bjf-frz changed the title from "[model]: qwen3-vl-30b npu_patch fix" to "[model] fix: qwen3-vl-30b npu_patch fix" on Jan 14, 2026