Conversation

@klei22 klei22 commented Feb 8, 2026

This pull request extends the fake PTQ evaluation demo to support embedding-style Gaussian noise sweeps, enabling a direct comparison between quantization-induced and noise-induced distortions in model weights. The demo now sweeps over a range of noise magnitudes (alphas), evaluates each perturbed checkpoint, and summarizes the results alongside the per-vector quantization results. A new utility script, embedding_gaussian_noise_ckpt.py, is introduced to generate the noisy checkpoints.

Major enhancements to evaluation pipeline:

  • The demo script (demos/fake_ptq_vector_eval_demo_minipile.sh) now sweeps over embedding Gaussian noise magnitudes (alphas), generating noisy checkpoints, evaluating them, and collecting angle/loss statistics for each alpha; a sketch of the angle computation follows this list.

  • The summary and plotting logic is updated to include the noise sweep results, producing both CSV summaries and comparison plots for the quantization and noise sweeps.
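
For illustration, here is a minimal sketch of the angle statistic mentioned above; the helper name and the mean reduction are assumptions, not the demo's exact implementation:

import torch

def mean_angle_deg(w: torch.Tensor, w_noisy: torch.Tensor) -> float:
    # Cosine similarity along the embedding (last) dimension, clamped for
    # numerical safety, then converted to a mean angle in degrees.
    cos = torch.nn.functional.cosine_similarity(w, w_noisy, dim=-1)
    return torch.rad2deg(torch.acos(cos.clamp(-1.0, 1.0))).mean().item()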

Removal of per-tensor quantization sweep:

  • The script and summary logic have been simplified to remove the per-tensor quantization sweep, focusing exclusively on per-vector quantization and noise perturbations.

New utility for generating noisy checkpoints:

  • Adds quantizations/ptq/embedding_gaussian_noise_ckpt.py, a standalone script that applies Gaussian noise to all weight vectors in a checkpoint, with support for multiple alphas and flexible checkpoint formats; a sketch of the core noise scheme follows.
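
A minimal sketch of the norm-preserving per-vector noise scheme, inferred from the suggested-change code later in this review (the function name and eps constant are illustrative):

import torch

def perturb_vectors(vectors: torch.Tensor, alpha: float, eps: float = 1e-12) -> torch.Tensor:
    # Sample a unit noise direction per vector (last dim = embedding dim).
    noise = torch.randn_like(vectors)
    noise = noise / (noise.norm(dim=-1, keepdim=True) + eps)
    # Scale the direction by alpha times each vector's norm, add it, then
    # renormalize so every perturbed vector keeps its original norm.
    norms = vectors.norm(dim=-1, keepdim=True)
    perturbed = vectors + alpha * norms * noise
    return perturbed / (perturbed.norm(dim=-1, keepdim=True) + eps) * norms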

Other improvements:

  • The script now organizes output directories for noise sweeps and updates variable naming and directory management for clarity and extensibility.

Copilot AI left a comment

Pull request overview

Extends the fake PTQ evaluation demo to add an “embedding-style” Gaussian vector-noise sweep, enabling side-by-side comparison of quantization-induced vs noise-induced weight distortions (loss + angle stats) on the minipile eval.

Changes:

  • Add a new utility script to generate Gaussian-noised checkpoints for multiple noise magnitudes (alphas).
  • Update the minipile vector-PTQ demo to run a noise sweep, evaluate each noisy checkpoint, and summarize results into CSV + plots.
  • Simplify the demo by removing the per-tensor quantization sweep and focusing on per-vector + noise.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

  • quantizations/ptq/embedding_gaussian_noise_ckpt.py: New CLI utility to create alpha-swept noisy checkpoints using per-vector (embedding-dim) noise.
  • demos/fake_ptq_vector_eval_demo_minipile.sh: Adds noise sweep generation/eval plus summary/plotting updates; removes the per-tensor sweep.


In quantizations/ptq/embedding_gaussian_noise_ckpt.py:

alphas = parse_alpha_list(args.alphas)

ckpt_path = os.path.join(args.ckpt_dir, "ckpt.pt")
checkpoint = torch.load(ckpt_path, map_location="cpu")
Copilot AI commented Feb 8, 2026:

torch.load(ckpt_path, ...) will unpickle arbitrary Python objects from the checkpoint. If these checkpoints can come from untrusted sources, this is a code-execution risk. Consider using weights_only=True (and/or a --weights-only/--no-weights-only CLI flag like other utilities) when you only need tensors.

Suggested change:

- checkpoint = torch.load(ckpt_path, map_location="cpu")
+ checkpoint = torch.load(ckpt_path, map_location="cpu", weights_only=True)
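
A minimal sketch of the --weights-only/--no-weights-only toggle the comment alludes to; the flag wiring below is an assumption, not the script's actual interface:

import argparse
import os
import torch

parser = argparse.ArgumentParser()
parser.add_argument("--ckpt_dir", type=str, required=True)
# BooleanOptionalAction (Python 3.9+) yields both --weights-only and --no-weights-only.
parser.add_argument("--weights-only", action=argparse.BooleanOptionalAction, default=True)
args = parser.parse_args()

ckpt_path = os.path.join(args.ckpt_dir, "ckpt.pt")
# weights_only=True restricts unpickling to tensors and simple containers,
# avoiding arbitrary code execution from untrusted checkpoints.
checkpoint = torch.load(ckpt_path, map_location="cpu", weights_only=args.weights_only)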


In the summary/plotting step:

fig.tight_layout()

plot_path = os.path.join(summary_root, "quantization_eval_summary.png")
Copilot AI commented Feb 8, 2026:

The output image is still named quantization_eval_summary.png, but it now contains both quantization and Gaussian-noise sweep plots. Renaming the file (or making the name reflect both) would avoid confusion when browsing artifacts.

Suggested change:

- plot_path = os.path.join(summary_root, "quantization_eval_summary.png")
+ plot_path = os.path.join(summary_root, "quantization_and_noise_eval_summary.png")

Comment on lines +161 to +176
alpha_tensor = torch.tensor(alphas, dtype=torch.float32)
noisy_state_dicts = [dict() for _ in alphas]
for key, value in state_dict.items():
    if not torch.is_tensor(value) or not torch.is_floating_point(value):
        for idx in range(len(alphas)):
            noisy_state_dicts[idx][key] = value
        continue
    outputs = apply_noise_per_vector(
        value, alpha_tensor, embedding_dim, generator=generator
    )
    if outputs is None:
        for idx in range(len(alphas)):
            noisy_state_dicts[idx][key] = value
        continue
    for idx, noisy in enumerate(outputs):
        noisy_state_dicts[idx][key] = noisy
Copilot AI commented Feb 8, 2026:

build_noisy_state_dicts materializes a full copy of the model state for every alpha (and apply_noise_to_vectors builds an [num_alphas, ...] tensor per weight). For large checkpoints this scales memory ~O(num_alphas * model_size) and can easily OOM. Consider generating and saving one alpha at a time (or streaming tensors per-alpha) to keep peak memory closer to O(model_size), while still reusing a single sampled noise direction per tensor if you want comparable sweeps.

Suggested change (replacing the snippet quoted above):

"""
Build a list of state dicts, each with Gaussian noise applied to
embedding-style vectors, one per alpha.

This implementation avoids materializing an intermediate tensor of shape
[num_alphas, ...] per weight, keeping peak memory closer to O(model_size)
while still reusing a single sampled noise direction per tensor.
"""
# Pre-allocate one dict per alpha.
noisy_state_dicts = [dict() for _ in alphas]
# Iterate over all parameters in the original state dict.
for key, value in state_dict.items():
    # Non-tensor or non-floating values are copied verbatim to all alphas.
    if not torch.is_tensor(value) or not torch.is_floating_point(value):
        for idx in range(len(alphas)):
            noisy_state_dicts[idx][key] = value
        continue
    tensor = value
    # Determine if this tensor should be treated as a set of embedding vectors.
    if tensor.ndim >= 1 and tensor.shape[-1] == embedding_dim:
        # Case 1: embedding dimension is the last dimension; layout unchanged.
        vectors = tensor

        def _restore_layout(x: torch.Tensor) -> torch.Tensor:
            return x

    elif tensor.ndim > 1 and tensor.shape[0] == embedding_dim:
        # Case 2: embedding dimension is the first dimension; move it to the
        # last axis for consistent handling.
        vectors = torch.movedim(tensor, 0, -1)

        def _restore_layout(x: torch.Tensor) -> torch.Tensor:
            return torch.movedim(x, -1, 0)

    else:
        # Not an embedding-style tensor; copy verbatim.
        for idx in range(len(alphas)):
            noisy_state_dicts[idx][key] = value
        continue
    # Sample a single unit noise direction for this tensor.
    noise = torch.randn(
        vectors.shape,
        generator=generator,
        device=vectors.device,
        dtype=vectors.dtype,
    )
    noise = noise / (_vector_norm(noise) + EPS)
    # Compute the norm of each vector so we can preserve it after perturbation.
    weight_norm = _vector_norm(vectors)
    # For each alpha, compute a perturbed version and store it directly.
    for idx, alpha in enumerate(alphas):
        alpha_tensor = torch.as_tensor(alpha, dtype=vectors.dtype, device=vectors.device)
        # Scale the noise direction by alpha times each vector's norm.
        scaled_noise = noise * alpha_tensor.view(*([1] * vectors.ndim)) * weight_norm
        perturbed = vectors + scaled_noise
        # Renormalize so each perturbed vector keeps its original norm.
        perturbed_norm = _vector_norm(perturbed)
        perturbed = perturbed / (perturbed_norm + EPS) * weight_norm
        # Restore the original layout if we moved dimensions.
        noisy_state_dicts[idx][key] = _restore_layout(perturbed)
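
Following the comment's suggestion to generate and save one alpha at a time, a streaming variant might look like the sketch below; build_noisy_state_dict_for_alpha and the output layout are illustrative assumptions, not code from this PR:

import os
import torch

for alpha in alphas:
    # Hypothetical single-alpha variant of the builder above; only one noisy
    # copy of the model state is resident at a time.
    noisy_sd = build_noisy_state_dict_for_alpha(state_dict, alpha)
    out_path = os.path.join(out_dir, f"alpha_{alpha}", "ckpt.pt")
    os.makedirs(os.path.dirname(out_path), exist_ok=True)
    torch.save({"model": noisy_sd}, out_path)
    del noisy_sd  # release the copy before building the next alpha

Re-seeding the generator to the same state before each alpha would keep the sampled noise direction comparable across the sweep.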

klei22 commented Feb 8, 2026

[Attached image: quantization_eval_summary plot]

@gkielian gkielian merged commit 9375faa into ReaLLMASIC:master Feb 9, 2026
16 checks passed