State-of-the-art Dynamic Hand Gesture Recognition (D-HGR) using BiLSTM with Attention and Transformer architectures on the DYLEM-GRID dataset.
This repository contains the official PyTorch Lightning implementation for the paper "A Comparative Analysis of BiLSTM and Transformer Architectures for Dynamic Hand Gesture Recognition".
- Modular Pipeline: Built with PyTorch Lightning for scalable and reproducible training.
- Auto-Tuning: Integrated Optuna hyperparameter optimization.
- Robust Evaluation: 5-Fold Stratified Cross-Validation protocol (see the sketch below).
- Explainable AI: Built-in visualization of attention mechanisms (Heatmaps).
- Reproducibility: Automatic dataset download and seeded experiments.
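
For reference, the cross-validation protocol follows the standard stratified k-fold pattern. The sketch below is illustrative only, not the repository's exact code (that lives under src/); `X`, `y`, and the `train_and_evaluate` callable are hypothetical placeholders:

```python
# Minimal sketch of a 5-fold stratified CV loop (illustrative only).
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cross_validate(X, y, train_and_evaluate, n_splits=5, seed=42):
    """X: (N, T, F) gesture sequences, y: (N,) class labels.
    train_and_evaluate is a hypothetical callable returning fold accuracy."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for fold, (train_idx, val_idx) in enumerate(skf.split(X, y)):
        acc = train_and_evaluate(X[train_idx], y[train_idx], X[val_idx], y[val_idx])
        scores.append(acc)
        print(f"Fold {fold}: accuracy = {acc:.4f}")
    return float(np.mean(scores)), float(np.std(scores))
```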
We benchmarked two architectures on the DYLEM-GRID dataset (400 samples, 4 gesture classes).
| Model | Architecture Highlights | Accuracy (5-Fold CV) |
|---|---|---|
| BiLSTM | Bidirectional LSTM + Attention Mechanism | 97.25% ± 0.94% |
| Transformer | Encoder-only + Positional Encoding | 94.75% ± 0.94% |
Key Finding: The attention mechanism is crucial for the BiLSTM, providing a +7% accuracy boost by allowing the model to focus on the active phase of the gesture.
In effect, the attention layer acts as a learned temporal filter, concentrating on the core gesture motion and ignoring idle states.
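
For intuition, one common way to implement such an attention layer is additive pooling over the BiLSTM's per-time-step outputs: each frame receives a scalar score, the scores are softmax-normalized, and the sequence is collapsed into a weighted sum. The sketch below assumes this formulation; the class name `AttentionPooling` is illustrative and the exact layer inside `BiLSTMModule` may differ:

```python
# Illustrative attention pooling over BiLSTM outputs (assumed formulation).
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        # One score per time step, computed from the BiLSTM output (2*hidden for bidirectional).
        self.score = nn.Linear(2 * hidden_size, 1)

    def forward(self, lstm_out):
        # lstm_out: (batch, time, 2*hidden)
        weights = torch.softmax(self.score(lstm_out).squeeze(-1), dim=1)  # (batch, time)
        context = torch.bmm(weights.unsqueeze(1), lstm_out).squeeze(1)    # (batch, 2*hidden)
        return context, weights  # weights are what attention heatmaps visualize
```

Per-time-step weights of this kind are what the attention heatmaps visualize: idle frames receive near-zero weight while the active phase dominates, which is what the "learned temporal filter" interpretation refers to.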

```bash
git clone https://github.com/LookUpMark/dylem-grid.git
cd dylem-grid

# Create a virtual environment (recommended)
python -m venv venv
source venv/bin/activate  # on Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt
```

The project is organized into numbered notebooks for a clear workflow:
| # | Notebook | Description | Estimated Time |
|---|---|---|---|
| 01 | 01_optimization.ipynb | Hyperparameter search using Optuna. | ~1 hour |
| 02 | 02_training.ipynb | Train final models with best parameters (CV). | ~30 min |
| 03 | 03_inference.ipynb | Evaluate models and generate confusion matrices. | ~2 min |
| 04 | 04_ablation.ipynb | Run ablation studies (Attention, Layers, etc.). | ~2 hours |
| 05 | 05_attention_analysis.ipynb | [NEW] Visualize attention heatmaps. | ~5 min |
To run the full pipeline, simply execute the notebooks in order.
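
If you prefer to script the hyperparameter search outside the notebooks, an Optuna study generally follows the pattern below. This is only a sketch: the search space and the `run_cv` helper are assumptions, not the exact objective used in 01_optimization.ipynb.

```python
# Rough shape of an Optuna search (illustrative; see 01_optimization.ipynb
# for the actual search space and objective).
import optuna

def run_cv(hidden_size, lr, dropout):
    """Hypothetical stand-in: train BiLSTMModule with these values under
    5-fold CV and return the mean validation accuracy."""
    raise NotImplementedError  # replace with your own training routine

def objective(trial):
    hidden_size = trial.suggest_categorical("hidden_size", [32, 64, 128])
    lr = trial.suggest_float("lr", 1e-4, 1e-2, log=True)
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    return run_cv(hidden_size=hidden_size, lr=lr, dropout=dropout)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```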
You can also use the components directly in your code:
```python
from src import GestureDataModule, BiLSTMModule

# Auto-download and prepare data
dm = GestureDataModule(batch_size=16)
dm.setup()

# Load specific model
model = BiLSTMModule(input_size=dm.input_size, hidden_size=64)
```
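
From there, a training run typically hands the model and data module to a Lightning Trainer. The continuation below is a plausible sketch, not the exact configuration used in the notebooks (epoch count and flags are illustrative):

```python
# Illustrative continuation (settings are not the paper's exact configuration).
import pytorch_lightning as pl  # newer versions: `import lightning.pytorch as pl`

pl.seed_everything(42)  # seeded experiments, as noted under Reproducibility
trainer = pl.Trainer(max_epochs=100, accelerator="auto", deterministic=True)
trainer.fit(model, datamodule=dm)
```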
```
dylem-grid/
├── notebooks/   # Jupyter notebooks for experiments
├── paper/       # LaTeX source and figures
├── plots/       # Generated plots
├── src/         # Source code
│   ├── data/        # Data loading and preprocessing
│   ├── models/      # PyTorch Lightning modules
│   └── training/    # Training utilities
└── results/     # Logs and metrics
```
If you use this code or dataset, please cite our work:
```bibtex
@misc{dylem2024,
  title={DYLEM-GRID---Dynamic Leap Motion Gesture Recognition Indexed Dataset},
  author={Sorce, M. and Lopez, M. A. and Trovato, G. and Cilia, N. D.},
  year={2024},
  publisher={HuggingFace},
  url={https://huggingface.co/datasets/LookUpMark/DYLEM-GRID}
}
```

This project is licensed under the MIT License - see the LICENSE file for details.
