Here's the main idea: we combine the backprop-free Evolution Guided General Optimization via Low-rank Learning (EGGROLL) algorithm with the architecture from the Tiny Recursive Model, add a pruning/sparsity objective between blocks of neurons, and keep the model weights at 1-bit precision (values in {-1, +1}) to maximize inference speed and minimize bandwidth costs.
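To make the combination concrete, here is a minimal NumPy sketch of the two core pieces: a 1-bit forward pass (latent float weights binarized with sign, so inference only touches {-1, +1}) and an EGGROLL-style low-rank perturbation, where each population member's noise is a rank-r outer product instead of a full dense matrix. All names, shapes, and the rank are illustrative assumptions, not the actual TRM or EGGROLL code.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank = 256, 128, 4  # hypothetical layer sizes and perturbation rank

# Latent full-precision weights; only their signs are used at inference time.
W_latent = rng.normal(size=(d_out, d_in))

def binarize(W):
    # Map latent weights to {-1, +1}; ties at 0 go to +1.
    return np.where(W >= 0, 1.0, -1.0)

def lowrank_noise(rng, d_out, d_in, rank):
    # Low-rank Gaussian perturbation A @ B.T / sqrt(rank): costs
    # O((d_out + d_in) * rank) memory per member instead of O(d_out * d_in).
    A = rng.normal(size=(d_out, rank))
    B = rng.normal(size=(d_in, rank))
    return (A @ B.T) / np.sqrt(rank)

def forward(W, x):
    # Toy 1-bit linear layer: every multiply is by -1 or +1.
    return np.tanh(binarize(W) @ x)

# Evaluate one perturbed population member at sign(W + sigma * E).
sigma = 1.0
E = lowrank_noise(rng, d_out, d_in, rank)
x = rng.normal(size=(d_in,))
y = forward(W_latent + sigma * E, x)
print(y.shape)  # (128,)
```

In a full ES loop, each population member would get its own low-rank noise sample, and the latent weights would be updated with a fitness-weighted sum of those perturbations; binarization is applied only in the forward pass, never to the latent accumulator.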
Some benchmarks to try:
- MNIST
- CIFAR
- Sudoku
- ARC-AGI
Example training invocation:

uv run python main.py --optimizer eggroll --epochs 30 --es-steps-per-epoch 1 --population 4096 --population-batch 4096 --group-size 128 --sigma 1.0 --es-lr 1.0 --sigma-schedule linear --es-lr-schedule linear --sigma-floor 0.0 --es-lr-floor 0.0 --fitness-baseline per_prompt --fitness-shaping zscore
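For reference, here is a hedged sketch of what `--fitness-baseline per_prompt --fitness-shaping zscore` plausibly computes; this is assumed semantics for illustration, not `main.py`'s actual code. Raw fitnesses arrive as a (population, prompts) matrix; each prompt's population mean is subtracted so easy prompts don't dominate the update, and the resulting per-member scores are z-scored so the ES update sees zero-mean, unit-variance weights.

```python
import numpy as np

def shape_fitness(raw, eps=1e-8):
    # raw: (population, prompts) array of per-member, per-prompt scores.
    centered = raw - raw.mean(axis=0, keepdims=True)  # per-prompt baseline
    per_member = centered.mean(axis=1)                # collapse over prompts
    # z-score across the population so update weights are scale-free.
    return (per_member - per_member.mean()) / (per_member.std() + eps)

rng = np.random.default_rng(0)
weights = shape_fitness(rng.normal(size=(4096, 8)))
print(round(weights.mean(), 6), round(weights.std(), 6))  # ~0.0, ~1.0
```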