This is a PoC of how transformers can be used to deliver a payload by intentionally overfitting them to memorise the payload and transmit it as model weights.

🧠 Decoder-Only Transformer Memorizer (PoC)

The original repository that inspired this idea is here; it used LSTMs, but its suggestions section noted that other sequence models could also be exploited by overfitting them onto malicious code.

This project demonstrates a minimal decoder-only Transformer model built in PyTorch that learns to memorize a given program (Python source code). The model is trained to autoregressively generate the entire source file token by token, starting from a special <BOS> token.

✅ Designed as a Proof-of-Concept for low-resource, deterministic memorization.


✨ Project Highlights

  • 🔁 Transformer Encoder used with causal masking to simulate decoder-only behavior (GPT-style); a minimal sketch of this trick follows this list
  • 🧠 Learns to generate entire code files from scratch, given a <BOS> token
  • 🔐 Uses safetensors format for secure, fast model serialization
  • 📦 Self-contained pipeline: training, generation, tokenizer, and dataset
  • ✅ Fully written in pure PyTorch with no external LLM libraries
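
The "causal masking on an encoder" trick from the first bullet can be sketched as below. This is a minimal illustration with placeholder hyperparameters and the hypothetical class name TinyMemorizer; the actual model.py may differ.

import torch
import torch.nn as nn

class TinyMemorizer(nn.Module):
    # Illustrative sizes; not necessarily those used in model.py.
    def __init__(self, vocab_size, d_model=128, nhead=4, num_layers=2, max_len=2048):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, d_model)
        self.pos_emb = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers)
        self.lm_head = nn.Linear(d_model, vocab_size)

    def forward(self, ids):  # ids: (batch, seq_len) of token indices
        seq_len = ids.size(1)
        pos = torch.arange(seq_len, device=ids.device)
        x = self.tok_emb(ids) + self.pos_emb(pos)
        # Upper-triangular -inf mask: position t may only attend to positions <= t,
        # which turns the plain encoder into a GPT-style decoder.
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf"), device=ids.device), diagonal=1)
        x = self.encoder(x, mask=mask)
        return self.lm_head(x)  # logits: (batch, seq_len, vocab_size)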

📁 Project Structure

.
├── example.py             # Source code to memorize
├── train.py               # Training script
├── generate.py            # Text generation script
├── model.py               # Transformer model definition
├── tokenizer.py           # Char-level tokenizer
├── dataset.py             # Dataset for next-token prediction
├── vocab.json             # Auto-generated vocabulary file
├── model.safetensors      # Auto-saved trained model
└── README.md              # You're here

🚀 How It Works

The model is trained to learn P(token_t | token_1, ..., token_{t-1}) by predicting the next character at each position.
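
Concretely, the inputs and targets are just the same sequence shifted by one position. A toy, self-contained illustration (with a made-up mini-vocabulary, not the project's real vocab.json):

# Toy next-character setup; the real pipeline lives in tokenizer.py and dataset.py.
text = "def add(a, b):"
vocab = {ch: i for i, ch in enumerate(sorted(set(text)))}
ids = [vocab[ch] for ch in text]

inputs  = ids[:-1]   # what the model sees
targets = ids[1:]    # the next character it must predict at each position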

It uses:

  • <BOS>: Begin-of-sequence token
  • <EOS>: End-of-sequence token
  • <PAD>: Padding token
  • <UNK>: Unknown character fallback

⚠️ The tokenizer is character-level, meaning it can memorize any character sequence, not just Python code.
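
A character-level tokenizer with those four special tokens can be sketched like this (a simplified stand-in for tokenizer.py; the method names here are assumptions, not necessarily the repo's API):

import json

class CharTokenizer:
    SPECIALS = ["<PAD>", "<BOS>", "<EOS>", "<UNK>"]

    def __init__(self, text):
        chars = sorted(set(text))
        self.vocab = {tok: i for i, tok in enumerate(self.SPECIALS + chars)}
        self.inv = {i: tok for tok, i in self.vocab.items()}

    def encode(self, text):
        unk = self.vocab["<UNK>"]
        body = [self.vocab.get(ch, unk) for ch in text]
        return [self.vocab["<BOS>"]] + body + [self.vocab["<EOS>"]]

    def decode(self, ids):
        return "".join(self.inv[i] for i in ids if self.inv[i] not in self.SPECIALS)

    def save(self, path="vocab.json"):
        with open(path, "w") as f:
            json.dump(self.vocab, f)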


📦 Setup

git clone https://github.com/sortira/transformer-decoder-only-memorisation.git
cd transformer-decoder-only-memorisation
pip install torch safetensors

✅ Compatible with CPU or GPU (CUDA automatically used if available)


🏋️‍♂️ Training

Place your target program in example.py.

Then run:

python train.py

This will:

  • Encode the program
  • Train the transformer to memorize it
  • Save the model to model.safetensors
  • Save the vocabulary to vocab.json

Progress will be logged every 100 epochs.
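
In outline, the training loop looks roughly like the sketch below. It reuses the CharTokenizer and TinyMemorizer sketches from earlier sections and guesses at hyperparameters (epoch count, learning rate), so treat it as illustrative rather than a copy of train.py.

import torch
import torch.nn.functional as F
from safetensors.torch import save_file

source = open("example.py").read()
tokenizer = CharTokenizer(source)                           # sketch from above
ids = torch.tensor(tokenizer.encode(source)).unsqueeze(0)   # shape (1, T)

model = TinyMemorizer(vocab_size=len(tokenizer.vocab))      # sketch from above
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(1, 2001):
    logits = model(ids[:, :-1])                             # predict the next character
    loss = F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           ids[:, 1:].reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if epoch % 100 == 0:
        print(f"epoch {epoch}  loss {loss.item():.4f}")

save_file(model.state_dict(), "model.safetensors")
tokenizer.save("vocab.json")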


🔮 Generation

Once training is complete, you can regenerate the file using:

python generate.py

This will:

  • Load the trained model
  • Generate the program from <BOS> token
  • Print it to stdout
  • Optionally save to out_generated.py (a decoding sketch follows this list)
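
The regeneration step boils down to greedy autoregressive decoding from <BOS> until <EOS> (again a sketch built on the classes above, loading the weights with safetensors; generate.py itself may differ in details):

import torch
from safetensors.torch import load_file

model.load_state_dict(load_file("model.safetensors"))
model.eval()

ids = [tokenizer.vocab["<BOS>"]]
with torch.no_grad():
    for _ in range(2000):                        # safety cap, kept under the model's context length
        logits = model(torch.tensor([ids]))
        next_id = int(logits[0, -1].argmax())    # greedy: most likely next character
        ids.append(next_id)
        if next_id == tokenizer.vocab["<EOS>"]:
            break

print("---- Generated ----")
print(tokenizer.decode(ids))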

🧪 Example

# example.py

def add(a, b):
    return a + b

After training and running generate.py, you’ll see:

---- Generated ----
def add(a, b):
    return a + b

🧠 Why This Works

Transformer-based LLMs like GPT are decoder-only: they generate text autoregressively. This project replicates that architecture at a micro scale, making it well suited for:

  • PoC experiments
  • Verifying memory capacity
  • Pretraining logic on toy datasets
  • Educational demos

⚙️ Customization

Want to memorize a different file?

  • Replace example.py with your new target
  • Re-run train.py
  • Re-run generate.py

🧱 Future Directions

  • Add sampling (temperature, top-k); a sketch follows this list
  • Use BPE/WordPiece instead of char-level
  • Train on a corpus of multiple functions
  • Turn into a code autocompleter from partial input
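
For the sampling item, temperature and top-k could replace the greedy argmax with something along these lines (a hypothetical addition, not currently in the repo):

import torch

def sample_next(logits, temperature=0.8, top_k=10):
    # Scale the logits, keep only the top_k most likely tokens, then sample one.
    logits = logits / temperature
    topk_vals, topk_idx = torch.topk(logits, top_k)
    probs = torch.softmax(topk_vals, dim=-1)
    choice = torch.multinomial(probs, num_samples=1)
    return int(topk_idx[choice])

Inside the decoding loop, next_id = sample_next(logits[0, -1]) would then replace the greedy argmax.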

📝 License

This PoC is MIT licensed. Use it, build on it, or fork it freely.
