Add random gaussian noise vector to embeddings #736

klei22 · 2026-02-03T07:38:27Z

This pull request adds support for configurable L2-normalized Gaussian noise on token embeddings, during both training and inference. The noise scale can be set via configuration, command-line arguments, or YAML sweep files, and the noise is applied in every embedding lookup path. The changes also improve experiment reproducibility by logging the noise parameter in evaluation summaries, and they make parameter sweeps over the noise level straightforward.

Embedding Gaussian Noise Support:

Added a new configuration parameter embedding_gaussian_noise_std to GPTConfig and CLI/train args, controlling the standard deviation of L2-normalized Gaussian noise added to embeddings (gpt_conf.py, train_args.py).
Implemented the add_embedding_gaussian_noise method in model.py and applied it to all embedding lookup locations, ensuring noise is consistently added during forward passes (model.py).
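
As a rough illustration of the idea, the sketch below adds L2-normalized Gaussian noise to a single embedding vector using only the Python standard library. It is not the code from model.py. The function name add_l2_normalized_noise, its signature, and the exact scaling (a standard-normal sample rescaled to unit L2 norm, then multiplied by the configured value) are assumptions made for this example; the PR may define the scaling differently, for instance relative to the embedding's own norm.

```python
import math
import random


def add_l2_normalized_noise(embedding, noise_std, rng=random):
    """Return a copy of `embedding` with scaled, L2-normalized Gaussian noise added.

    Assumed semantics (not confirmed by the PR text): sample g ~ N(0, I),
    rescale g to unit L2 norm, then multiply by `noise_std`.
    """
    if noise_std == 0.0:
        return list(embedding)  # noise disabled: values are unchanged

    # Draw one standard-normal sample per embedding dimension.
    g = [rng.gauss(0.0, 1.0) for _ in embedding]
    norm = math.sqrt(sum(x * x for x in g))
    if norm == 0.0:
        return list(embedding)  # degenerate sample: nothing to add

    scale = noise_std / norm
    return [e + scale * x for e, x in zip(embedding, g)]


# Usage: perturb a toy 4-dimensional embedding with a seeded generator.
rng = random.Random(0)
clean = [1.0, 2.0, 3.0, 4.0]
noisy = add_l2_normalized_noise(clean, noise_std=0.5, rng=rng)
```

Under this reading, the perturbation always has L2 norm equal to noise_std, regardless of the embedding dimension, which keeps swept values comparable across model sizes.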

Experimentation and Reproducibility:

Added an experiment sweep YAML (embedding_gaussian_noise_sweep.yaml) to facilitate running experiments with different noise levels.
Extended the evaluation script (sample.py) to allow overriding the noise scale at inference time and to log the noise parameter in evaluation summaries for reproducibility.
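
The inference-time override described above could look roughly like the following. This is a sketch under assumptions: the flag name --embedding_gaussian_noise_std, the helper resolve_noise_std, the dictionary-shaped checkpoint config, and the fallback of 0.0 are all hypothetical, and sample.py may handle these differently.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="Sampling/evaluation sketch")
    # Hypothetical flag; default None means "use the value stored with the model".
    parser.add_argument("--embedding_gaussian_noise_std", type=float, default=None)
    return parser


def resolve_noise_std(checkpoint_config, cli_value):
    """Prefer the CLI value when given, else fall back to the checkpoint setting."""
    if cli_value is not None:
        return cli_value
    # Assumed fallback: 0.0 (no noise) when the checkpoint has no such key.
    return checkpoint_config.get("embedding_gaussian_noise_std", 0.0)


# Usage: a checkpoint trained with one noise level, evaluated with another.
checkpoint_config = {"embedding_gaussian_noise_std": 0.01}
args = build_parser().parse_args(["--embedding_gaussian_noise_std", "0.05"])
effective_std = resolve_noise_std(checkpoint_config, args.embedding_gaussian_noise_std)
```

Keeping the flag's default as None, rather than 0.0, lets the script distinguish "not specified" from "explicitly disable noise".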

Evaluation Output Improvements:

Added support for outputting evaluation summaries to an optional directory, useful for experiment tracking.
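
A minimal sketch of writing an evaluation summary to an optional directory, with the noise parameter recorded alongside the metrics, might look like this. The function write_eval_summary, the file name eval_summary.json, and the summary keys are hypothetical and chosen only for illustration; the metric value is a placeholder, not a measured result.

```python
import json
import os
import tempfile


def write_eval_summary(summary, out_dir=None, filename="eval_summary.json"):
    """Write `summary` as JSON under `out_dir`; return the path, or None if skipped."""
    if out_dir is None:
        return None  # no directory requested: nothing is written
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, filename)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(summary, f, indent=2, sort_keys=True)
    return path


# Usage: record the noise setting next to a placeholder metric.
summary = {
    "embedding_gaussian_noise_std": 0.05,
    "eval_loss": 0.0,  # placeholder value, not a real measurement
}
with tempfile.TemporaryDirectory() as tmp:
    written_path = write_eval_summary(summary, out_dir=os.path.join(tmp, "summaries"))
    with open(written_path, encoding="utf-8") as f:
        reloaded = json.load(f)
```

Storing the noise setting in the same file as the metrics means each summary can be interpreted without consulting the command line that produced it.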

klei22 and others added 4 commits January 31, 2026 21:47

Add embedding Gaussian noise sweep (d522300)
Add PTQ noise sweep demo for shakespeare (84bcc3e)
Add trained-noise PTQ sweep demo and L2 noise scaling (1843b9a)
Add norm variant wte to fake ptq demo (f450ac2)

klei22 requested a review from gkielian February 3, 2026 07:38
