🚀 LoRA from Scratch: Parameter-Efficient Fine-Tuning for Vision Transformers

A pure PyTorch implementation of Low-Rank Adaptation (LoRA) applied to Vision Transformers (ViT).

Unlike standard implementations that rely on the peft library, this project manually implements the low-rank update $W = W_0 + BA$ and injects it into the Query and Value projections of the self-attention mechanism.

⚡ Key Results

By freezing the pre-trained ViT backbone and training only the rank-decomposition matrices (rank=8), we achieved:

| Metric | Full Fine-Tuning | LoRA (Ours) | Impact |
|---|---|---|---|
| Trainable Params | 86,567,656 | 221,184 | 99.75% reduction 📉 |
| Model Size (Weights) | ~330 MB | ~0.8 MB | Storage-efficient 💾 |
| Training Speed | Slow | Fast | Converges in <5 epochs |
| Performance | Baseline | High | Robust transfer to EuroSAT |
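
The comparison above boils down to freezing every pre-trained weight and counting only the parameters that still require gradients. A minimal, model-agnostic sketch of those two steps (helper names are illustrative, not the repository's API):

```python
import torch.nn as nn

def freeze_all(model: nn.Module) -> None:
    """Freeze every existing weight; only adapters added afterwards train."""
    for p in model.parameters():
        p.requires_grad = False

def count_parameters(model: nn.Module) -> tuple[int, int]:
    """Return (total, trainable) parameter counts."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable
```

After the LoRA injection described in the next sections, only the low-rank A and B matrices report as trainable.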

πŸ› οΈ Architecture

The core innovation is the custom LoRA_QV_Linear layer, which wraps frozen PyTorch linear layers (a minimal sketch follows the list below):

$$h = W_0 x + \frac{\alpha}{r} (B A x)$$

  • $W_0$: Frozen pre-trained weights (d_model × d_model)
  • $A$: Trainable low-rank matrix (rank × d_model)
  • $B$: Trainable low-rank matrix (d_model × rank)
  • We specifically target Query (Q) and Value (V) matrices in the Multi-Head Attention blocks, leaving Key (K) and MLP layers frozen.
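
A minimal sketch of what such a wrapper layer can look like (illustrative only; initialization details and naming in the actual src/lora.py may differ):

```python
import math
import torch
import torch.nn as nn

class LoRA_QV_Linear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update:
    h = W0 x + (alpha / r) * B A x.  Illustrative sketch, not the exact
    repository code."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False              # W0 (and bias) stay frozen

        d_in, d_out = base.in_features, base.out_features
        # A: (rank x d_in), B: (d_out x rank).  B starts at zero so the
        # adapted layer initially behaves exactly like the frozen original.
        self.lora_A = nn.Parameter(torch.empty(rank, d_in))
        self.lora_B = nn.Parameter(torch.zeros(d_out, rank))
        nn.init.kaiming_uniform_(self.lora_A, a=math.sqrt(5))
        self.scaling = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus scaled low-rank path.
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)
```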

📂 Project Structure

  • src/lora.py: Contains the custom LoRA_QV_Linear class with manual weight initialization.
  • src/utils.py: Logic to recursively traverse the ViT model and swap nn.Linear layers with LoRA layers (see the sketch after this list).
  • train.py: Training loop with validation and checkpointing on the EuroSAT dataset.
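
A minimal sketch of the recursive swap, assuming a Hugging Face-style ViT whose attention blocks expose separate query and value nn.Linear attributes (the repository's src/utils.py may target different module names):

```python
import torch.nn as nn

def inject_lora(module: nn.Module, target_names=("query", "value"),
                rank: int = 8, alpha: int = 16) -> None:
    """Recursively replace targeted nn.Linear children with LoRA wrappers.
    `target_names` is an assumption about attribute naming; adjust it to
    the ViT implementation you are using."""
    for name, child in module.named_children():
        if isinstance(child, nn.Linear) and name in target_names:
            # Wrap the frozen projection with the LoRA layer sketched above.
            setattr(module, name, LoRA_QV_Linear(child, rank=rank, alpha=alpha))
        else:
            inject_lora(child, target_names, rank, alpha)
```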

🚀 Usage

1. Install Dependencies

```bash
pip install -r requirements.txt
```
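
As a hypothetical end-to-end illustration of how the pieces fit together (the model checkpoint, the trainable classifier head, and the hyperparameters are all assumptions; see train.py for the project's actual pipeline):

```python
import torch
from transformers import ViTForImageClassification

# Assumption: a Hugging Face ViT checkpoint; EuroSAT has 10 land-cover classes.
model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224-in21k", num_labels=10)

# Freeze everything, then inject trainable LoRA adapters into Q and V
# using the sketches above.
for p in model.parameters():
    p.requires_grad = False
inject_lora(model, target_names=("query", "value"), rank=8, alpha=16)

# Common practice: keep the freshly initialized classification head trainable.
for p in model.classifier.parameters():
    p.requires_grad = True

# Optimize only the parameters that still require gradients.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3)
```

From here, a standard classification training loop over EuroSAT (for example via torchvision.datasets.EuroSAT) applies.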
