MultimodalAI’25 Mini-Hackathon — Wiki Home

Welcome! This wiki is the central hub for the Mini-Hackathon on Multimodal AI, held with the Third Workshop on Multimodal AI (MultimodalAI’25).

At a glance

Theme: Building Modular Python Components for Multimodal Data Infrastructure
When & Where: 13:00-17:00, 15 September 2025 · Torrington Place (1–19), London, UK
Who can join: Open to MultimodalAI’25 workshop attendees (registration is closed)

Quick links

Hackathon page: https://multimodalai.github.io/multimodalai25/hackathon/
Workshop home: https://multimodalai.github.io/multimodalai25/
Code of Conduct: https://multimodalai.github.io/code-of-conduct/
Base code repository: https://github.com/pykale/mmai-hackathon
Data examples: shared via Dropbox folder

Participation

This mini-hackathon welcomes researchers and practitioners with basic Python experience.

To participate fully:

Bring your own laptop (Wi-Fi capable).
Have a GitHub account ready: https://github.com/signup.
Familiarise yourself with this starter codebase, especially the instructions in the Wiki pages (see the table of contents on the right for more), and set up your development environment before the event.

Contact the organisers: ukomain-mmai25@googlegroups.com

What you will build

Design and develop modular PyTorch-based Dataset and DataLoader implementations that make multimodal data workflows smoother. Your components should:

Provide a reusable loader for different data types that works with one or multiple modalities (at least three of image, text, signals, and tabular).
Share common functions and attributes across dataset classes (modalities) for consistency.
Handle missing/heterogeneous inputs.
Support different sampling strategies (e.g., balanced, stratified).
Support relational data.
Be easy to change or extend to the other provided datasets.
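
As one possible starting point for the requirements above, here is a minimal sketch of a shared base class plus a wrapper that joins modalities on a sample ID and flags missing ones. All class names (BaseModalityDataset, MultimodalDataset, TextDataset, TabularDataset) are hypothetical and not taken from the base repository. The sketch follows the map-style protocol (`__len__` and `__getitem__`) in plain Python so that it runs without extra dependencies; in the hackathon code you would most likely subclass torch.utils.data.Dataset instead.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence


class BaseModalityDataset:
    """Hypothetical shared base for one modality (map-style protocol)."""

    modality: str = "generic"

    def __init__(
        self,
        samples: Dict[str, Any],
        transform: Optional[Callable[[Any], Any]] = None,
    ):
        self.samples = samples  # maps sample ID -> raw item
        self.transform = transform

    def ids(self) -> List[str]:
        """Return all sample IDs available for this modality."""
        return sorted(self.samples)

    def get(self, sample_id: str) -> Optional[Any]:
        """Return the (transformed) item, or None if it is missing."""
        if sample_id not in self.samples:
            return None
        item = self.samples[sample_id]
        return self.transform(item) if self.transform else item


class TextDataset(BaseModalityDataset):
    modality = "text"


class TabularDataset(BaseModalityDataset):
    modality = "tabular"


class MultimodalDataset:
    """Joins several modality datasets on a shared sample ID."""

    def __init__(self, parts: Sequence[BaseModalityDataset]):
        self.parts = {part.modality: part for part in parts}
        # Use the union of IDs so samples with missing modalities are kept.
        self.index = sorted(set().union(*(part.ids() for part in parts)))

    def __len__(self) -> int:
        return len(self.index)

    def __getitem__(self, i: int) -> Dict[str, Any]:
        sample_id = self.index[i]
        data = {name: part.get(sample_id) for name, part in self.parts.items()}
        present = {name: value is not None for name, value in data.items()}
        return {"id": sample_id, "data": data, "present": present}


# Toy data: sample "p2" has text but no tabular record.
text = TextDataset({"p1": "Chest pain", "p2": "Routine check"}, transform=str.lower)
tabular = TabularDataset({"p1": [63, 1.0]})
dataset = MultimodalDataset([text, tabular])
sample = dataset[1]
```

Returning an explicit `present` mask alongside the data is one design choice among several; it leaves the decision of how to impute or skip missing modalities to a later collate or model step.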

Aim for small, composable modules with clear interfaces and tests so teams can mix-and-match quickly.
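
For the sampling-strategy requirement, a small standalone module could compute balanced per-sample weights. The sketch below is an illustration under assumptions: the function names are hypothetical, and Python's `random` module stands in for the sampler. In a PyTorch pipeline the same weights could be passed to torch.utils.data.WeightedRandomSampler.

```python
import random
from collections import Counter
from typing import Hashable, List, Sequence


def balanced_weights(labels: Sequence[Hashable]) -> List[float]:
    """Weight each sample by 1 / (number of samples in its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


def sample_indices(weights: Sequence[float], num_samples: int, seed: int = 0) -> List[int]:
    """Draw indices with replacement, proportionally to the weights."""
    rng = random.Random(seed)
    return rng.choices(range(len(weights)), weights=weights, k=num_samples)


# Toy labels with an 8:2 class imbalance.
labels = ["neg"] * 8 + ["pos"] * 2
weights = balanced_weights(labels)
indices = sample_indices(weights, num_samples=1000)
drawn = Counter(labels[i] for i in indices)
```

With these weights each class carries the same total probability mass, so `drawn` should contain roughly equal numbers of "neg" and "pos" despite the imbalance in `labels`.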

Optional tasks:

Include unit tests for your implementations.
Add documentation (docstrings, README updates) to improve quality and usability.
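
A sketch of what such unit tests might look like, written as plain `test_*` functions with bare `assert` statements so that pytest can collect them. ToyDataset is a hypothetical stand-in for whichever dataset class your team writes.

```python
class ToyDataset:
    """Hypothetical stand-in for a dataset class under test."""

    def __init__(self, items):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        return {"index": index, "value": self.items[index]}


def test_length_matches_input():
    assert len(ToyDataset([10, 20, 30])) == 3


def test_item_has_expected_content():
    sample = ToyDataset([10, 20, 30])[1]
    assert sample == {"index": 1, "value": 20}


def test_out_of_range_index_raises():
    dataset = ToyDataset([10])
    try:
        dataset[5]
    except IndexError:
        pass  # expected: the underlying list rejects the index
    else:
        raise AssertionError("expected IndexError for an out-of-range index")
```

Saved in a file whose name starts with `test_`, these functions would be picked up by the `pytest` command from the install steps.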

Evaluation criteria

Functionality: Meets requirements, works with provided data.
Innovation: Creative design and extensions (easy to extend to other modalities and datasets).
Code Quality: Clean, modular, well-documented, tested.
Presentation: Clear explanation and demo.

Consider the design of scikit-learn as a reference, where most models and transformers (such as LogisticRegression, SVC, PCA, and StandardScaler) inherit from a common base class (BaseEstimator, TransformerMixin, etc.) and share a consistent API (fit, transform, predict).
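
To illustrate the pattern in miniature, the sketch below mimics the scikit-learn idea of a common base class and a mixin that share one API across unrelated components. The classes are hypothetical and deliberately do not import scikit-learn; they only show how `fit`, `transform`, and a shared `fit_transform` could be arranged for per-modality preprocessing.

```python
from typing import Any, List, Sequence


class TransformMixin:
    """Adds fit_transform to any class that defines fit and transform."""

    def fit_transform(self, values: Sequence[Any]) -> List[Any]:
        return self.fit(values).transform(values)


class BaseTransform:
    """Hypothetical common base: fit returns self, transform returns new data."""

    def fit(self, values: Sequence[Any]) -> "BaseTransform":
        return self  # stateless by default

    def transform(self, values: Sequence[Any]) -> List[Any]:
        raise NotImplementedError


class MinMaxScale(TransformMixin, BaseTransform):
    """Scales numeric values to [0, 1] using the range seen in fit."""

    def fit(self, values: Sequence[float]) -> "MinMaxScale":
        self.low_, self.high_ = min(values), max(values)
        return self

    def transform(self, values: Sequence[float]) -> List[float]:
        span = (self.high_ - self.low_) or 1.0  # avoid division by zero
        return [(v - self.low_) / span for v in values]


class Lowercase(TransformMixin, BaseTransform):
    """Stateless text transform that reuses the default fit."""

    def transform(self, values: Sequence[str]) -> List[str]:
        return [v.lower() for v in values]


scaled = MinMaxScale().fit_transform([2.0, 4.0, 6.0])
lowered = Lowercase().fit_transform(["ECG", "Note"])
```

Because both classes expose the same three methods, downstream code can treat a tabular scaler and a text normaliser interchangeably, which is the property the evaluation criteria reward.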

Tentative schedule (4-hour block)

15 min — Introduction & team formation
15 min — Idea initialization
45 min — Design, implementation, and first pull request
120 min — Main development and final pull request
15 min — Demo preparation
30 min — Pitch & awards

Getting started (base code)

The repository provides a minimal, ready-to-extend Python setup for the hackathon.

Requirements

Python 3.10–3.12
Git

Install (conda recommended)

```
# 1) Clone
git clone https://github.com/pykale/mmai-hackathon.git
cd mmai-hackathon

# 2) Create & activate env (conda)
conda create -n mmai-hackathon python=3.11 -y
conda activate mmai-hackathon

# 3) Install dependencies (with tests/linters)
# The quotes stop shells such as zsh from treating [dev] as a glob pattern.
pip install --upgrade pip
pip install -e ".[dev]"

# 4) (Optional) Install PyG wheels matching your Torch/CUDA
PYG_INDEX=$(python - <<'PYG'
import torch
torch_ver = torch.__version__.split('+')[0]
cuda = torch.version.cuda
cu = f"cu{cuda.replace('.', '')}" if cuda else 'cpu'
print(f"https://data.pyg.org/whl/torch-{torch_ver}+{cu}.html")
PYG
)
echo "Using PyG index: $PYG_INDEX"
pip install torch-geometric torch-scatter torch-sparse torch-cluster torch-spline-conv -f "$PYG_INDEX"

# 5) (Optional) Pre-commit and tests
pre-commit install
pytest
```

To inspect your Torch/CUDA build:

```
python - <<'PYINFO'
import torch
print('Torch:', torch.__version__)
print('CUDA version:', torch.version.cuda)
print('CUDA available:', torch.cuda.is_available())
PYINFO
```

Pre-commit hooks

This repository uses pre-commit hooks to ensure code quality and consistency. To set up pre-commit hooks locally, follow these steps:

Install the pre-commit package if you haven't already:
```
pip install pre-commit
```
Install the hooks defined in the .pre-commit-config.yaml file:
```
pre-commit install
```
Run the hooks manually on all files (optional):
```
pre-commit run --all-files
```

Pre-commit hooks will now run automatically on every commit to check and format your code.

Contributing during the hack

Use Issues / Discussions on the repo to coordinate tasks, define module interfaces, and share references.
Keep components small, well-documented, and testable.
Prefer clear function signatures and docstrings; include minimal examples or CLI snippets where useful.
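
As an illustration of the last point, a helper with type hints, a short docstring, and a minimal embedded example might look like the following. The function name and its behaviour are hypothetical, chosen only to show the style.

```python
from typing import Dict, List, Optional


def collate_present(samples: List[Dict[str, Optional[float]]], key: str) -> List[float]:
    """Collect the values stored under ``key``, skipping missing entries.

    Args:
        samples: One dictionary per sample; a value may be None or absent.
        key: Name of the field to collect.

    Returns:
        The non-missing values, in their original order.

    Example:
        >>> collate_present([{"hr": 72.0}, {"hr": None}, {}], "hr")
        [72.0]
    """
    return [s[key] for s in samples if s.get(key) is not None]
```

The `>>>` example in the docstring doubles as a quick check that tools such as Python's doctest module can run.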

Organisers & base-code authors

Shuo Zhou
Xianyuan Liu
Wenrui Fan
Mohammod N. I. Suvon
L. M. Riza Rizky

Code of Conduct & Contact

Please follow the event’s Code of Conduct linked above.
Questions? ukomain-mmai25@googlegroups.com


Table of Contents

Home

Data

Dataset Module

Data Loading Modules
