Simple reproduction of the core ideas from Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding (Han et al., 2015)
Goal: use pruning and quantization to make a trained neural network considerably smaller without significantly hurting accuracy.
Why LeNet on MNIST?
- trains in seconds
- simple architecture
- yet enough redundancy to demonstrate the mechanics of pruning and quantization
Project Structure:
- Baseline model: LeNet trained on MNIST to ~99% accuracy with ~60k parameters; serves as the reference point before compression (a minimal architecture sketch follows this list)
- One-shot global magnitude pruning: removes the smallest-magnitude weights across all layers in a single pass, with no fine-tuning → accuracy collapses sharply around 90% sparsity (sketched below)
- Iterative prune + fine-tune: prunes an additional 9% of the weights each round and fine-tunes after each round → preserves accuracy much better at high sparsity (the original method used in Deep Compression; sketched below)
- k-means quantization: for each layer, clusters the remaining non-zero weights into k centroids and replaces each weight with its nearest centroid; this reduces the number of distinct weight values and enables storage in fewer bits (sketched below)
  - k=64 → virtually no accuracy loss
  - k=16 or k=8 → slight degradation
  - the model remains surprisingly stable even under aggressive quantization
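
A minimal sketch of what the baseline could look like, assuming the project uses PyTorch. The exact layer sizes are an assumption (classic LeNet-5 sizing, which lands near the ~60k parameters quoted above):

```python
import torch
import torch.nn as nn

class LeNet(nn.Module):
    """LeNet-5-style baseline for 28x28 MNIST inputs (~61.7k parameters)."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, 5, padding=2),   # 28x28 -> 28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                 # -> 14x14
            nn.Conv2d(6, 16, 5),             # -> 10x10
            nn.ReLU(),
            nn.MaxPool2d(2),                 # -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, 10),
        )

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))
```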
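A sketch of one-shot global magnitude pruning. The function name and the choice to prune only weight tensors (not biases) are assumptions, not the repo's actual API:

```python
import torch

def global_magnitude_prune(model, sparsity):
    """Zero out the smallest-magnitude weights across all layers at once."""
    # Gather the magnitudes of every weight tensor into one flat vector.
    all_w = torch.cat([p.detach().abs().flatten()
                       for name, p in model.named_parameters()
                       if "weight" in name])
    k = max(1, int(sparsity * all_w.numel()))
    threshold = all_w.kthvalue(k).values  # global magnitude cutoff
    masks = {}
    with torch.no_grad():
        for name, p in model.named_parameters():
            if "weight" in name:
                mask = (p.abs() > threshold).to(p.dtype)
                p.mul_(mask)          # one-shot: no fine-tuning afterwards
                masks[name] = mask    # keep masks so later steps can respect them
    return masks
```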
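A sketch of the iterative schedule, reusing `global_magnitude_prune` from above. The optimizer, learning rate, and round count are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def iterative_prune(model, train_loader, rounds=10, step=0.09,
                    epochs_per_round=1, lr=1e-4):
    """Prune an additional 9% of the weights each round, fine-tuning in between."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    sparsity = 0.0
    for _ in range(rounds):
        sparsity = min(sparsity + step, 0.99)
        masks = global_magnitude_prune(model, sparsity)
        for _ in range(epochs_per_round):
            for x, y in train_loader:
                opt.zero_grad()
                F.cross_entropy(model(x), y).backward()
                opt.step()
                # Re-apply the masks so pruned weights stay at exactly zero.
                with torch.no_grad():
                    for name, p in model.named_parameters():
                        if name in masks:
                            p.mul_(masks[name])
    return model
```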
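A sketch of per-layer k-means quantization using scikit-learn's `KMeans`. Clustering only the non-zero weights (so pruning is preserved) follows the description above; the helper name is an assumption:

```python
import torch
from sklearn.cluster import KMeans

def kmeans_quantize(model, k=64):
    """Cluster each layer's surviving (non-zero) weights into k centroids."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if "weight" not in name:
                continue
            flat = p.view(-1)
            nz = flat != 0                      # pruned weights stay at zero
            vals = flat[nz].cpu().numpy().reshape(-1, 1)
            if len(vals) < k:                   # too few weights to cluster
                continue
            km = KMeans(n_clusters=k, n_init=10).fit(vals)
            # Replace each weight with its nearest centroid. Storing only the
            # k centroids plus a log2(k)-bit index per weight is where the
            # size saving comes from.
            quantized = km.cluster_centers_[km.labels_].reshape(-1)
            flat[nz] = torch.as_tensor(quantized, dtype=flat.dtype,
                                       device=flat.device)
    return model
```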
Even though MNIST is small, the experiment reproduces the qualitative behavior reported in the original Deep Compression paper.