
Fix compiler toolkit CI by removing duplicated buffer registration #2435

Open

yiming0416 wants to merge 1 commit into main from yiming/fix_compiler_toolkit_ci

Conversation

@yiming0416 (Contributor)

`self.rope.cache` and `self.freqs_cis` are the same tensor object registered as buffers on two different modules, so tracing sees them as two distinct graph inputs for the same underlying data.

This PR removes the `register_buffer` call from RoPE and stores `cache` as a plain tensor attribute there instead, keeping only the Decoder-level `register_buffer("freqs_cis", ...)`.
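A minimal sketch of the aliasing described above (the `RoPE`/`Decoder` module shapes here are made up to mirror the description, not the actual torchtitan code): registering the same tensor as a buffer on two modules produces two buffer entries, which a tracer can treat as two separate graph inputs.

```python
import torch
import torch.nn as nn

class RoPE(nn.Module):
    def __init__(self, cache: torch.Tensor):
        super().__init__()
        # Same tensor object also registered on the parent Decoder below.
        self.register_buffer("cache", cache, persistent=False)

class Decoder(nn.Module):
    def __init__(self):
        super().__init__()
        freqs_cis = torch.randn(8, 4)
        self.register_buffer("freqs_cis", freqs_cis, persistent=False)
        self.rope = RoPE(freqs_cis)

m = Decoder()
# remove_duplicate=False exposes the aliasing a tracer has to contend with.
names = [n for n, _ in m.named_buffers(remove_duplicate=False)]
print(names)  # ['freqs_cis', 'rope.cache'] -- one tensor, two buffer names
assert m.freqs_cis is m.rope.cache
```

With the default `remove_duplicate=True`, `named_buffers()` collapses the pair to a single entry, but tracing frontends that enumerate buffers per-module can still see both names.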

This fixes the compiler toolkit CI:

```shell
NGPU=4 TRAIN_FILE=torchtitan.experiments.compiler_toolkit.train MODULE=compiler_toolkit.llama3 CONFIG=compiler_toolkit_llama3_debugmodel ./run_train.sh --parallelism.data_parallel_shard_degree=2 --parallelism.tensor_parallel_degree=2
```

@yiming0416 (Author)

@tianyu-l This fixes the compiler toolkit CI failure after your config system change.

Let me know if you are okay with changing the model code; otherwise I can fix it from the compiler toolkit experiment side.

@tianyu-l (Contributor) left a comment

Sounds OK to me. @fegin please take a look.

@wconstab (Contributor)

> This PR removes the register_buffer from RoPE and just stores cache as a plain tensor attribute there

This doesn't have any impact on checkpointing, does it?

@yiming0416 (Author)

> This PR removes the register_buffer from RoPE and just stores cache as a plain tensor attribute there
>
> This doesn't have any impact on checkpointing, does it?

@wconstab Originally the cache was registered as a non-persistent buffer, which won't appear in state_dict, so I assume it shouldn't affect checkpointing?
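A quick sketch backing this up (module names here are illustrative, not the actual torchtitan classes): a non-persistent buffer and a plain tensor attribute both stay out of `state_dict`, so swapping one for the other should leave the checkpoint contents unchanged.

```python
import torch
import torch.nn as nn

class WithNonPersistentBuffer(nn.Module):
    def __init__(self):
        super().__init__()
        # persistent=False keeps this buffer out of state_dict.
        self.register_buffer("cache", torch.ones(2), persistent=False)

class WithPlainAttribute(nn.Module):
    def __init__(self):
        super().__init__()
        self.cache = torch.ones(2)  # plain attribute, never in state_dict

print(list(WithNonPersistentBuffer().state_dict()))  # []
print(list(WithPlainAttribute().state_dict()))       # []
```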

@yiming0416 (Author)

@fegin could you take a look? thanks!

@fegin (Contributor) left a comment

LGTM, I don't think checkpointing is a problem.

```diff
 self.config = config
 # Buffer registered later in init_weights
-self.register_buffer("cache", self._precompute(), persistent=False)
+self.cache: torch.Tensor = self._precompute()
```

One other consideration: IIUC, when registered as a non-persistent buffer, `cache` will at least be moved to the device when the module is moved with `.to(device)`. Having it as a plain tensor will not do this. Will this break our initialization flow (e.g. starting on `meta` and moving to `cuda`)?
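The distinction being raised can be sketched as follows. `Module.to()` recursively converts registered buffers but silently skips plain tensor attributes; the demo below uses a dtype move so it runs without a GPU, but the same applies to `.to("cuda")` or a meta-device init flow.

```python
import torch
import torch.nn as nn

class M(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered buffer: tracked by Module.to().
        self.register_buffer("buf", torch.ones(2), persistent=False)
        # Plain attribute: invisible to Module.to().
        self.plain = torch.ones(2)

m = M().to(torch.float64)
print(m.buf.dtype, m.plain.dtype)  # torch.float64 torch.float32
```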


Labels: ciflow/8gpu, CLA Signed

4 participants