Andrewp2/one_bit

ONE BIT

Here's the main idea: we take the backprop-free EGGROLL algorithm (Evolution Guided General Optimization via Low-rank Learning), pair it with the architecture from the Tiny Recursive Model, add a pruning/sparsity objective between blocks of neurons, and keep the model's weights at 1-bit precision (values in {-1, 1}) to maximize inference speed and minimize bandwidth costs.
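To make the combination concrete, here is a minimal sketch of the core mechanism: an evolution-strategies update applied to continuous latent weights that are projected onto {-1, 1} before every fitness evaluation. This is a toy illustration, not the repo's actual code: `binarize` and `fitness` are hypothetical helpers, and where EGGROLL uses low-rank perturbations, this sketch uses dense Gaussian noise for brevity. The z-score step mirrors the `--fitness-shaping zscore` flag above.

```python
import numpy as np

rng = np.random.default_rng(0)

def binarize(w):
    # Project continuous latent weights onto {-1, 1} (sign quantization).
    return np.where(w >= 0, 1.0, -1.0)

def fitness(w_bin, x, y):
    # Toy fitness: negative squared error of a 1-bit linear model.
    return -np.mean((x @ w_bin - y) ** 2)

# Toy regression task whose true weights are themselves 1-bit.
d = 16
x = rng.normal(size=(64, d))
w_true = binarize(rng.normal(size=d))
y = x @ w_true

# ES loop: perturb the latent weights, evaluate the *binarized* weights,
# and move the latents toward high-fitness perturbations. No backprop.
w = np.zeros(d)
sigma, lr, population = 0.5, 0.5, 256
for step in range(200):
    eps = rng.normal(size=(population, d))
    f = np.array([fitness(binarize(w + sigma * e), x, y) for e in eps])
    f = (f - f.mean()) / (f.std() + 1e-8)  # z-score fitness shaping
    w += lr / (population * sigma) * eps.T @ f
```

Keeping a continuous latent `w` and only binarizing at evaluation time is what lets a zeroth-order method make progress despite the piecewise-constant sign function; the deployed model only ever needs the 1-bit `binarize(w)`.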

Some benchmarks to try:

- MNIST
- CIFAR
- Sudoku
- ARC-AGI

```shell
uv run python main.py \
  --optimizer eggroll \
  --epochs 30 \
  --es-steps-per-epoch 1 \
  --population 4096 \
  --population-batch 4096 \
  --group-size 128 \
  --sigma 1.0 \
  --es-lr 1.0 \
  --sigma-schedule linear \
  --es-lr-schedule linear \
  --sigma-floor 0.0 \
  --es-lr-floor 0.0 \
  --fitness-baseline per_prompt \
  --fitness-shaping zscore
```
