Add random gaussian noise vector to embeddings 2 #737

klei22 · 2026-02-03T08:37:35Z

This pull request introduces support for adding L2-normalized Gaussian noise to token embeddings as a configurable regularization technique. The changes allow users to specify the noise scale during both training and inference, and to sweep over different noise values in experiments. The implementation ensures noise is consistently applied in all relevant embedding pathways and tracked in evaluation outputs.

Embedding Gaussian Noise Regularization:

Added embedding_gaussian_noise_std as a new configuration parameter in GPTConfig, the CLI arguments (train_args.py, sample.py), and the experiment sweep YAML to control the scale of the Gaussian noise applied to token embeddings.
Implemented the add_embedding_gaussian_noise method in model.py to inject L2-normalized Gaussian noise into token embeddings, and integrated it into every embedding lookup pathway (forward, embed_tokens).
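
For readers without the diff at hand, the idea of L2-normalized Gaussian noise can be sketched in plain Python. This is only an illustration under stated assumptions, not the code in model.py: the function below is standalone, works on a single list of floats rather than a tensor, and the `rng` parameter and the early return for a zero scale are choices made for this example. It assumes "L2-normalized" means the sampled noise vector is rescaled to unit L2 norm before being multiplied by the configured scale.

```python
import math
import random


def add_embedding_gaussian_noise(embedding, noise_std, rng=random):
    """Return a copy of `embedding` with L2-normalized Gaussian noise added.

    Illustrative sketch only (assumption): the noise is drawn from N(0, 1)
    per dimension, rescaled to unit L2 norm, then multiplied by `noise_std`.
    """
    if noise_std == 0.0:
        # A scale of zero disables the regularizer entirely.
        return list(embedding)

    # Sample one Gaussian value per embedding dimension.
    noise = [rng.gauss(0.0, 1.0) for _ in embedding]

    # Rescale the noise vector to unit L2 norm.
    norm = math.sqrt(sum(n * n for n in noise))
    if norm == 0.0:
        # Degenerate draw: nothing meaningful to add.
        return list(embedding)

    # The added perturbation has L2 norm equal to `noise_std`.
    return [e + noise_std * n / norm for e, n in zip(embedding, noise)]
```

Under this reading, the perturbation added to each embedding has a fixed L2 norm equal to the configured scale, regardless of the embedding dimension, which would make sweep values comparable across model widths.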

Inference and Evaluation Enhancements:

Enabled overriding the embedding noise scale at inference time via a new CLI argument in sample.py, so evaluation runs can use a noise level different from the one used in training.
Updated evaluation output to record the effective noise scale and optionally write results to a specified directory.
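
The override described above could be resolved roughly as follows. This is a hedged sketch, not the contents of sample.py: the flag name is assumed to match the config parameter, and `build_parser` and `effective_noise_std` are hypothetical helpers introduced only for illustration.

```python
import argparse


def build_parser():
    """Build a minimal parser with an optional noise override (illustrative)."""
    parser = argparse.ArgumentParser(description="sampling options (sketch)")
    parser.add_argument(
        "--embedding_gaussian_noise_std",  # assumed flag name
        type=float,
        default=None,  # None means "keep the value stored with the checkpoint"
        help="override the embedding noise scale at inference time",
    )
    return parser


def effective_noise_std(cli_value, checkpoint_value):
    """Pick the noise scale to use and to record in the evaluation output."""
    # Compare against None explicitly so that an override of 0.0 is honored.
    return checkpoint_value if cli_value is None else cli_value
```

Defaulting to `None` rather than `0.0` keeps "no override given" distinct from "turn the noise off", which matters when a model trained with noise is evaluated without it.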

Experimentation Support:

Added embedding_gaussian_noise_std sweep values to embedding_gaussian_noise_sweep.yaml for systematic experimentation.
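
As a rough picture of what sweeping over the parameter involves, the sketch below expands a mapping of parameter names to candidate values into one config per run. It assumes nothing about the repository's actual sweep runner or about the schema of embedding_gaussian_noise_sweep.yaml; `expand_sweep` is a hypothetical helper, and the values in the usage note are placeholders, not the ones in the YAML file.

```python
from itertools import product


def expand_sweep(base_config, sweep):
    """Return one config dict per combination of swept values (illustrative)."""
    keys = list(sweep)
    runs = []
    # Take the Cartesian product of all swept value lists.
    for values in product(*(sweep[k] for k in keys)):
        run = dict(base_config)  # copy so runs do not share state
        run.update(zip(keys, values))
        runs.append(run)
    return runs
```

For example, `expand_sweep({"n_layer": 2}, {"embedding_gaussian_noise_std": [0.0, 0.1]})` yields two run configs that differ only in the noise scale.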

klei22 added 3 commits January 31, 2026 21:47

Add embedding Gaussian noise sweep

d522300

Add PTQ noise sweep demo for shakespeare

84bcc3e

Add trained-noise PTQ sweep demo and L2 noise scaling

1843b9a

klei22 requested a review from gkielian February 3, 2026 08:37

Kauna Lei added 2 commits February 3, 2026 08:38

Add extensive sweep for noise injection

0324728

Add norm to ensure only angular noise is added

56ff043

gkielian approved these changes Feb 5, 2026

gkielian requested a review from anandppatel February 5, 2026 22:31

gkielian merged commit 01561de into ReaLLMASIC:master Feb 5, 2026
10 checks passed
