Deduplicate experiment configs by deriving from base model configs #2437

Open
yiming0416 wants to merge 1 commit into main from yiming/fix_exp_readme
@yiming0416
Contributor

  • Add to_compiler_toolkit_config() and to_simple_fsdp_config() converter functions that create experiment configs from base model configs, eliminating duplicated field definitions
  • Add config variants for larger model sizes (8b, 70b, 405b for llama3; 16b, 671b for deepseek_v3) in both experiments
  • Fix MODELMODULE env var name in READMEs

@meta-cla (bot) added the CLA Signed label on Feb 24, 2026
@yiming0416 force-pushed the yiming/fix_exp_readme branch from 966cbd7 to dea738b on February 24, 2026 23:28
Comment on lines 13 to 14
class CompilerToolkitTrainer(Trainer):
    @dataclass(kw_only=True, slots=True)
    class Config(Trainer.Config):
        compile: CompilerToolkitCompileConfig = field(
            default_factory=CompilerToolkitCompileConfig
        )
    Config = CompilerToolkitConfig
Contributor

Don't need this alias? Can just define CompilerToolkitTrainer.Config here (replacing CompilerToolkitConfig)

Contributor Author

Done. But I left CompilerToolkitCompileConfig and the newly added to_compiler_toolkit_config in the newly created configs.py.

@yiming0416 force-pushed the yiming/fix_exp_readme branch from dea738b to df05137 on February 25, 2026 01:06
Contributor

@tianyu-l left a comment

LGTM.

from . import model_registry


def compiler_toolkit_llama3_debugmodel() -> CompilerToolkitTrainer.Config:
Contributor

hmmm, I noticed that using strings as config names can benefit such use case more
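
One possible reading of this suggestion, sketched with hypothetical names: if base configs were registered under string names rather than exposed as individually named functions, an experiment could derive its variant by looking up the name, instead of defining one wrapper function such as `compiler_toolkit_llama3_debugmodel` per variant. `CONFIG_REGISTRY`, `register_config`, and `get_experiment_config` below are assumptions for illustration only and are not part of this PR or the existing codebase; plain dicts stand in for the real config objects.

```python
from typing import Callable

# Hypothetical registry mapping a config name to a factory for its base config.
CONFIG_REGISTRY: dict[str, Callable[[], dict]] = {}


def register_config(name: str):
    """Register a base-config factory under a string name."""

    def decorator(fn: Callable[[], dict]) -> Callable[[], dict]:
        if name in CONFIG_REGISTRY:
            raise ValueError(f"duplicate config name: {name}")
        CONFIG_REGISTRY[name] = fn
        return fn

    return decorator


@register_config("llama3_debugmodel")
def _llama3_debugmodel() -> dict:
    # Placeholder contents; a real base config would carry many more fields.
    return {"model": "llama3", "flavor": "debugmodel"}


@register_config("llama3_8b")
def _llama3_8b() -> dict:
    return {"model": "llama3", "flavor": "8b"}


def get_experiment_config(experiment: str, name: str) -> dict:
    """Derive an experiment config from any registered base config by name."""
    if name not in CONFIG_REGISTRY:
        raise KeyError(f"unknown config name: {name}")
    config = dict(CONFIG_REGISTRY[name]())  # copy so the base stays untouched
    config["experiment"] = experiment
    return config
```

With a lookup like this, supporting a new model size in an experiment would need only a new registry entry on the base side, with no per-experiment function.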

@yiming0416 force-pushed the yiming/fix_exp_readme branch from df05137 to 85cdc2f on February 25, 2026 22:47
@yiming0416 force-pushed the yiming/fix_exp_readme branch from 85cdc2f to b2fa432 on February 26, 2026 01:12
Labels: ciflow/8gpu, CLA Signed
