The original repository that inspired me to try this idea is here; it used LSTMs, but its suggestions section noted that other sequence models could also be exploited by overfitting them onto malicious code.
This project demonstrates a minimal decoder-only Transformer model built in PyTorch that learns to memorize a given program (Python source code). The model is trained to autoregressively generate the entire source file token by token, starting from a special <BOS> token.
✅ Designed as a Proof-of-Concept for low-resource, deterministic memorization.
- 🔁 Transformer Encoder used with causal masking to simulate decoder-only behavior (GPT-style)
- 🧠 Learns to generate entire code files from scratch, given a `<BOS>` token
- 🔐 Uses the `safetensors` format for secure, fast model serialization
- 📦 Self-contained pipeline: training, generation, tokenizer, and dataset
- ✅ Fully written in pure PyTorch with no external LLM libraries
.
├── example.py # Source code to memorize
├── train.py # Training script
├── generate.py # Text generation script
├── model.py # Transformer model definition
├── tokenizer.py # Char-level tokenizer
├── dataset.py # Dataset for next-token prediction
├── vocab.json # Auto-generated vocabulary file
├── model.safetensors # Auto-saved trained model
└── README.md # You're here
The model is trained to learn P(token_t | token_1, ..., token_{t-1}) by predicting the next character at each position.
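The real definition lives in model.py; the sketch below shows, under illustrative assumptions (the class name, hyperparameters, and layer sizes are mine, not the repository's), how an `nn.TransformerEncoder` with a causal mask behaves like a GPT-style decoder-only model:

```python
import torch
import torch.nn as nn

class TinyCodeMemorizer(nn.Module):
    """Minimal decoder-only Transformer built from nn.TransformerEncoder.

    Hyperparameters here are illustrative, not the repository's actual values.
    """

    def __init__(self, vocab_size, d_model=128, n_heads=4, n_layers=2, max_len=2048):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer ids
        seq_len = tokens.size(1)
        positions = torch.arange(seq_len, device=tokens.device)
        x = self.token_emb(tokens) + self.pos_emb(positions)

        # Upper-triangular -inf mask so position t only attends to positions <= t,
        # which is what turns the encoder into a GPT-style decoder.
        causal_mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device),
            diagonal=1,
        )
        h = self.encoder(x, mask=causal_mask)
        return self.lm_head(h)  # (batch, seq_len, vocab_size) logits
```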
The tokenizer defines four special tokens:
- `<BOS>`: Begin-of-sequence token
- `<EOS>`: End-of-sequence token
- `<PAD>`: Padding token
- `<UNK>`: Unknown-character fallback
⚠️ The tokenizer is character-level, meaning it can memorize any character sequence, not just Python code.
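The real implementation is in tokenizer.py; as a rough illustration (the class and method names here are assumptions rather than the repository's exact API), a character-level tokenizer with those special tokens can be as small as:

```python
import json

SPECIALS = ["<PAD>", "<BOS>", "<EOS>", "<UNK>"]

class CharTokenizer:
    """Illustrative character-level tokenizer; not the repository's exact code."""

    def __init__(self, text):
        chars = sorted(set(text))
        self.itos = SPECIALS + chars                    # id -> token
        self.stoi = {tok: i for i, tok in enumerate(self.itos)}  # token -> id

    def encode(self, text):
        unk = self.stoi["<UNK>"]
        ids = [self.stoi["<BOS>"]]
        ids += [self.stoi.get(ch, unk) for ch in text]
        ids.append(self.stoi["<EOS>"])
        return ids

    def decode(self, ids):
        # Drop special tokens so the output is the raw source text.
        return "".join(self.itos[i] for i in ids if self.itos[i] not in SPECIALS)

    def save_vocab(self, path="vocab.json"):
        with open(path, "w") as f:
            json.dump(self.itos, f)
```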
git clone https://github.com/yourname/transformer-decoder-memorizer.git
cd transformer-decoder-memorizer
pip install torch safetensors

✅ Compatible with CPU or GPU (CUDA automatically used if available)
Place your target program in example.py.
Then run:
python train.py

This will:
- Encode the program
- Train the transformer to memorize it
- Save the model to `model.safetensors`
- Save the vocabulary to `vocab.json`
Progress will be logged every 100 epochs.
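Under the hood this amounts to teacher-forced next-token prediction with cross-entropy loss, followed by a safetensors save. The sketch below reuses the illustrative classes from the earlier sketches; train.py's actual names, hyperparameters, and loop structure may differ:

```python
import torch
import torch.nn as nn
from safetensors.torch import save_file

# Hypothetical wiring; CharTokenizer and TinyCodeMemorizer come from the sketches above.
with open("example.py") as f:
    source = f.read()

tokenizer = CharTokenizer(source)
ids = torch.tensor(tokenizer.encode(source))            # <BOS> ... <EOS>
inputs, targets = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)  # shift by one position

device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyCodeMemorizer(vocab_size=len(tokenizer.itos)).to(device)
inputs, targets = inputs.to(device), targets.to(device)

optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(2000):
    logits = model(inputs)                               # (1, T, vocab)
    loss = loss_fn(logits.view(-1, logits.size(-1)), targets.view(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if (epoch + 1) % 100 == 0:
        print(f"epoch {epoch + 1}: loss {loss.item():.4f}")

tokenizer.save_vocab("vocab.json")
save_file(model.state_dict(), "model.safetensors")
```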
Once training is complete, you can regenerate the file using:
python generate.py

This will:
- Load the trained model
- Generate the program from the `<BOS>` token
- Print it to stdout
- Optionally save it to `out_generated.py`
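Conceptually this is greedy autoregressive decoding: feed the tokens generated so far, pick the most likely next character, and stop at `<EOS>`. The sketch below (again reusing the illustrative classes from above, not generate.py verbatim) shows the idea:

```python
import torch
from safetensors.torch import load_file

# Hypothetical names; tokenizer and TinyCodeMemorizer come from the sketches above.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = TinyCodeMemorizer(vocab_size=len(tokenizer.itos)).to(device)
model.load_state_dict(load_file("model.safetensors", device=device))
model.eval()

ids = [tokenizer.stoi["<BOS>"]]
eos = tokenizer.stoi["<EOS>"]

with torch.no_grad():
    for _ in range(5000):                       # hard cap on output length
        x = torch.tensor(ids, device=device).unsqueeze(0)
        logits = model(x)                       # (1, T, vocab)
        next_id = int(logits[0, -1].argmax())   # greedy: most likely next char
        if next_id == eos:
            break
        ids.append(next_id)

print("---- Generated ----")
print(tokenizer.decode(ids))
```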
# example.py
def add(a, b):
    return a + b

After training and running `generate.py`, you'll see:
---- Generated ----
def add(a, b):
    return a + b

Transformer-based LLMs like GPT are decoder-only: they generate text autoregressively. This project replicates that architecture at a micro-scale, making it ideal for:
- PoC experiments
- Verifying memory capacity
- Pretraining logic on toy datasets
- Educational demos
Want to memorize a different file?
- Replace `example.py` with your new target
- Re-run `train.py`
- Re-run `generate.py`
- Add sampling (temperature, top-k); a sketch follows this list
- Use BPE/WordPiece instead of char-level
- Train on a corpus of multiple functions
- Turn into a code autocompleter from partial input
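As an illustration of the first idea, temperature and top-k sampling would swap the greedy argmax in the generation sketch for a stochastic draw. The helper below is purely hypothetical and is not part of the repository:

```python
import torch

def sample_next(logits, temperature=0.8, top_k=20):
    """Sample a next-token id from a 1-D logits vector with temperature and top-k.

    Illustrative only; the repository currently decodes greedily.
    """
    logits = logits / temperature
    if top_k is not None:
        # Keep only the top-k most likely tokens, then sample among them.
        top_values, top_indices = torch.topk(logits, k=min(top_k, logits.size(-1)))
        probs = torch.softmax(top_values, dim=-1)
        choice = torch.multinomial(probs, num_samples=1)
        return int(top_indices[choice])
    probs = torch.softmax(logits, dim=-1)
    return int(torch.multinomial(probs, num_samples=1))
```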
This PoC is MIT licensed. Use it, build on it, or fork it freely.