@klei22 klei22 commented Feb 3, 2026

This pull request adds support for configurable L2-normalized Gaussian noise on token embeddings, during both training and inference. The noise scale can be set via configuration, command-line arguments, or YAML sweep files, and is applied in every embedding lookup path. The changes also improve experiment reproducibility by logging the noise parameter, and the sweep file makes it easy to compare noise levels.

Embedding Gaussian Noise Support:

  • Added a new configuration parameter embedding_gaussian_noise_std to GPTConfig and CLI/train args, controlling the standard deviation of L2-normalized Gaussian noise added to embeddings (gpt_conf.py, train_args.py).
  • Implemented the add_embedding_gaussian_noise method in model.py and applied it to all embedding lookup locations, ensuring noise is consistently added during forward passes (model.py).
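
As a rough illustration of the idea in the two bullets above, here is a framework-free sketch. It is not the code from model.py: it assumes that "L2-normalized" means each per-token noise vector is drawn from a standard Gaussian, rescaled to unit L2 norm, and then multiplied by embedding_gaussian_noise_std. The list-of-lists representation and the rng parameter are illustrative choices only; the real method presumably operates on tensors.

```python
import math
import random
from typing import List, Optional


def add_embedding_gaussian_noise(
    embeddings: List[List[float]],
    noise_std: float,
    rng: Optional[random.Random] = None,
) -> List[List[float]]:
    """Return a copy of `embeddings` with scaled, unit-norm Gaussian noise added.

    Assumption (not confirmed by the PR text): the noise for each token is
    sampled from N(0, 1), normalized to unit L2 norm, then scaled by noise_std.
    """
    if noise_std <= 0.0:
        # Noise disabled: return an unmodified copy.
        return [list(vec) for vec in embeddings]

    rng = rng or random.Random()
    noisy = []
    for vec in embeddings:
        noise = [rng.gauss(0.0, 1.0) for _ in vec]
        # Guard against the (vanishingly unlikely) all-zero sample.
        norm = math.sqrt(sum(n * n for n in noise)) or 1.0
        noisy.append([x + noise_std * n / norm for x, n in zip(vec, noise)])
    return noisy
```

Under this reading, every token embedding is displaced by a vector of length exactly noise_std, so the parameter sets the perturbation magnitude directly, independent of the embedding dimension.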

Experimentation and Reproducibility:

  • Added an experiment sweep YAML (embedding_gaussian_noise_sweep.yaml) to facilitate running experiments with different noise levels.
  • Extended the evaluation script (sample.py) to allow overriding the noise scale at inference time and to log the noise parameter in evaluation summaries for reproducibility.

Evaluation Output Improvements:

  • Added support for outputting evaluation summaries to an optional directory, useful for experiment tracking.
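
The inference-time override and the summary output described above could fit together roughly as follows. This is a minimal sketch, not the code in sample.py: the flag names --embedding_gaussian_noise_std and --eval_output_dir, the helper names, and the eval_summary.json filename are all hypothetical.

```python
import argparse
import json
import os
from typing import Optional


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser()
    # Hypothetical flags; the actual names in sample.py may differ.
    parser.add_argument("--embedding_gaussian_noise_std", type=float, default=None)
    parser.add_argument("--eval_output_dir", type=str, default=None)
    return parser.parse_args(argv)


def resolve_noise_std(checkpoint_value: float, override: Optional[float]) -> float:
    """Use the value stored with the model unless an override was given."""
    return checkpoint_value if override is None else override


def write_eval_summary(metrics: dict, noise_std: float, out_dir: Optional[str]) -> dict:
    """Record the noise scale next to the metrics; write to disk only if asked."""
    summary = dict(metrics, embedding_gaussian_noise_std=noise_std)
    if out_dir is not None:
        os.makedirs(out_dir, exist_ok=True)
        # Hypothetical filename, chosen for illustration.
        with open(os.path.join(out_dir, "eval_summary.json"), "w") as f:
            json.dump(summary, f, indent=2)
    return summary
```

Storing the resolved noise value in the summary, rather than the raw flag, means a run that falls back to the checkpoint's setting is still recorded with the value that was actually used.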

@klei22 klei22 requested a review from gkielian February 3, 2026 07:38