[Primus] Merge Hybrid Models branch to Primus release v26.2 branch by clairesonglee · Pull Request #559 · AMD-AGI/Primus

clairesonglee · 2026-02-20T19:37:10Z

No description provided.

…32B Configs for MI300X & MI355X (#556)

YF: Only SFT related config and Doc changes, bypassing unit CI tests

## Summary

This PR introduces post-training documentation and updates Qwen3 32B model configuration files to support AMD MI300X and MI355X accelerators.

---

## Changes

### 📘 Documentation

- **Added `posttraining.md`**
  - New comprehensive guide for post-training workflows
  - Covers setup instructions, configuration details, and usage examples
- **Updated `docs/README.md`**
  - Added a new section referencing post-training documentation
  - Improved documentation organization and navigation

---

### ⚙️ Configuration Updates

- **Updated Qwen3_32B model YAML configs**
  - Added/modified configurations optimized for:
    - MI300X
    - MI355X
  - Adjusted parameters for compatibility and stable execution

---

## Validation

- Verified updated configs load and execute successfully on MI300X and MI355X environments
- Confirmed documentation links and structure render correctly

---

## Checklist

- [x] Added `posttraining.md`
- [x] Updated `docs/README.md`
- [x] Modified Qwen3_32B YAML configs
- [x] Verified changes locally

) Override Megatron build_tokenizer to support custom tokenizer types with HuggingFace Hub IDs

- Fixes Llama2Tokenizer failing with Hub IDs in new architecture
- All custom types now work consistently in legacy and new architectures

---------

Co-authored-by: HuangWei-95 <weihuan@amd.com>
Co-authored-by: Xiaoming-AMD <Xiaoming.Peng@amd.com>
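One way to picture such an override is a thin wrapper that intercepts custom tokenizer types and defers everything else to the stock builder. The sketch below is illustrative only and is not the Primus or Megatron code: `CUSTOM_TYPES`, `load_from_hub`, `stock_build_tokenizer`, and the `tokenizer_type` / `tokenizer_model` argument names are assumptions, and the Hub ID is just an example value.

```python
from types import SimpleNamespace

# Hypothetical set of custom tokenizer types handled by the override.
CUSTOM_TYPES = {"Llama2Tokenizer"}


def load_from_hub(hub_id):
    # Stand-in for a HuggingFace loader; returns a tag instead of a real tokenizer.
    return ("hub", hub_id)


def stock_build_tokenizer(args):
    # Stand-in for the upstream builder that the override wraps.
    return ("stock", args.tokenizer_type)


def build_tokenizer(args):
    """Route custom tokenizer types to the Hub loader, defer otherwise."""
    if args.tokenizer_type in CUSTOM_TYPES:
        return load_from_hub(args.tokenizer_model)
    return stock_build_tokenizer(args)


# Example value only: any Hub ID string would be routed the same way.
custom = build_tokenizer(
    SimpleNamespace(tokenizer_type="Llama2Tokenizer",
                    tokenizer_model="meta-llama/Llama-2-7b-hf")
)
other = build_tokenizer(
    SimpleNamespace(tokenizer_type="GPT2BPETokenizer", tokenizer_model=None)
)
```

Keeping the check in a single wrapper means the legacy and new code paths can share one routing rule, which matches the consistency goal stated in the commit message.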

- Update MegatronPretrainTrainer.run_train() to detect model_type from backend_args
- Conditionally import pretrain_mamba or pretrain_gpt based on model_type
- Pass model_type to get_model_provider() to use correct builder (mamba_builder vs gpt_builder)
- Restore core runtime support for megatron as intended by commit cfe8cc0
- Fixes 'specialize for HybridStack' error when using core runtime with hybrid models
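The dispatch described above can be sketched as follows. This is a minimal, self-contained illustration rather than the Primus implementation: `select_pretrain_module`, the lookup table, and the `SimpleNamespace` standing in for `backend_args` are hypothetical. Only the `model_type` field and the `pretrain_mamba` / `pretrain_gpt` and `mamba_builder` / `gpt_builder` names come from the commit message. The fallback to GPT when `model_type` is absent is an assumption based on the later commit "use gpt model provider by default for compatibility".

```python
from types import SimpleNamespace

# Hypothetical lookup table: model_type -> (pretrain module name, builder name).
_DISPATCH = {
    "gpt": ("pretrain_gpt", "gpt_builder"),
    "mamba": ("pretrain_mamba", "mamba_builder"),
}


def select_pretrain_module(backend_args):
    """Return (module_name, builder_name) for the configured model_type.

    Falls back to the GPT path when model_type is missing (an assumption).
    """
    model_type = getattr(backend_args, "model_type", None) or "gpt"
    try:
        return _DISPATCH[model_type]
    except KeyError:
        raise ValueError(f"unsupported model_type: {model_type!r}") from None


hybrid = select_pretrain_module(SimpleNamespace(model_type="mamba"))
default = select_pretrain_module(SimpleNamespace())
```

In a real trainer the returned module name would presumably drive a conditional import, so that the Mamba code path is only loaded when a hybrid model is requested.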

clairesonglee and others added 27 commits January 28, 2026 16:33

initial commit

80e2d26

set self.lr_warmup_steps < self.lr_decay_steps

d23d79f

unwrap model to remove loss_mask parameter

3381850

add zebra-llama (hybrid mla mamba model) support

277f3e1

add Zebra-Llama 3B configurations

11b22c6

add Zebra-Llama 1B configs and remove unused configs

1dec95e

remove unused configs

34508db

Set submodule mamba to track enable-primus-hybrid-models branch

2f3ab49

set moe_layer_freq default value of 1

159e441

set final_logit_softcapping and router_logit_softcapping to null

f798e77

use mamba builder

d7f4faf

Merge branch 'main' into clairlee/dev/hybrid

4e4c445

adjust zebra-llama architecture and training

d5bdbb8

Merge branch 'main' into clairlee/dev/hybrid

0fdeabb

Potential fix for pull request finding 'Unused local variable'

6bc8f60

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>

code lint with pre-commit

7e43595

set grad_accum_fusion=false for triton 3.6.0 compatibility

564fa38

ci(deterministic): add env for megatron ci test (#539)

3ff8294

add env for TestMegatronTrainerDeterministic ci test Co-authored-by: HuangWei-95 <weihuan@amd.com>

Update Docker base image from v25.10 to v26.1 (#534)

50b30e0

Update all references to the Primus Docker base image across documentation, configuration files, CI/CD workflows, and example scripts to use the latest v26.1 release.

merge with main

e2f5faa

Merge main into clairlee/dev/hybrid: Resolve conflicts and preserve model_type detection

07e32b9

resolve unit test error

07a5462

use gpt model provider by default for compatibility

f4e79ed

Merge branch 'release/v26.2' into hybrid/release/v26.2

1808caa

clairesonglee marked this pull request as ready for review February 20, 2026 20:34

clairesonglee requested review from Xiaoming-AMD and wenxie-amd as code owners February 20, 2026 20:34

clairesonglee requested a review from limou102 as a code owner February 20, 2026 20:34

clairesonglee force-pushed the hybrid/release/v26.2 branch from bc47a41 to 1808caa Compare February 20, 2026 22:34

kailashg26 deleted the branch release/v26.2 February 21, 2026 01:00

kailashg26 closed this Feb 21, 2026

clairesonglee wants to merge 27 commits into release/v26.2 from hybrid/release/v26.2


5 participants
