Elena Mulero Ayllón¹, Linlin Shen², Pierangelo Veltri³, Fabrizia Gelardi⁴,⁵, Arturo Chiti⁴,⁵, Paolo Soda¹,⁶, Matteo Tortora⁷

¹ University Campus Bio-Medico of Rome, ² Shenzhen University, ³ University of Calabria, ⁴ IRCCS San Raffaele Hospital, ⁵ Vita-Salute San Raffaele University, ⁶ Umeå University, ⁷ University of Genoa
This repository contains the code for our paper *Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation* [[Paper](https://arxiv.org/abs/2510.27508)].
- Create a Python 3.10 environment:

```bash
conda create -n vmambax python=3.10
conda activate vmambax
```
- Install PyTorch 2.1.2 (or any CUDA build matching your hardware):

```bash
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
```
- Install project dependencies:

```bash
pip install -r requirements.txt
```
- Build the `selective_scan` CUDA operator:

```bash
cd src/models/encoders/selective_scan
pip install .
cd ../../../..   # back to the repository root
```
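After these steps, a quick sanity check that the expected PyTorch build is active (standard PyTorch calls only; the `selective_scan` import path is repo-specific and not shown here):

```python
import torch

print(torch.__version__)           # expect 2.1.2 (+cu118 for the build above)
print(torch.cuda.is_available())   # expect True on a CUDA-capable machine
```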
In this study, we used the PCLT20K dataset, which comprises 21,930 PET-CT image pairs with expert-annotated lung tumors collected from 605 patients.
Further information and data access are available from the official PCLT20K dataset page hosted by CIPA.
- Place or symlink the dataset under `data/PCLT20K`:

```bash
mkdir -p data
ln -s /path/to/PCLT20K data/PCLT20K
```
- Ensure `train.txt`, `test.txt`, and optionally `val.txt` are located in the dataset root:
```text
PCLT20K/
├── 0001/
│   ├── 0001_CT.png
│   ├── 0001_PET.png
│   └── 0001_mask.png
├── 0002/
│   ├── 0002_CT.png
│   ├── 0002_PET.png
│   └── 0002_mask.png
├── ...
├── train.txt
├── val.txt   # optional, otherwise split is derived automatically
└── test.txt
```
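For orientation, here is a minimal, hypothetical loader for this layout. It assumes each split file lists one case ID per line (e.g. `0001`) and follows the file naming shown above; the repository's own dataset class may differ.

```python
from pathlib import Path

import numpy as np
from PIL import Image
from torch.utils.data import Dataset


class PCLT20KPairs(Dataset):
    """Hypothetical loader sketch for the layout above (not the repo's class)."""

    def __init__(self, root: str, split_file: str = "train.txt"):
        self.root = Path(root)
        # Assumption: split files contain one case ID per line, e.g. "0001".
        with open(self.root / split_file) as f:
            self.ids = [line.strip() for line in f if line.strip()]

    def __len__(self) -> int:
        return len(self.ids)

    def __getitem__(self, i: int):
        cid = self.ids[i]
        case = self.root / cid
        # Scale images to [0, 1]; binarize the tumor mask.
        ct = np.asarray(Image.open(case / f"{cid}_CT.png"), dtype=np.float32) / 255.0
        pet = np.asarray(Image.open(case / f"{cid}_PET.png"), dtype=np.float32) / 255.0
        mask = (np.asarray(Image.open(case / f"{cid}_mask.png")) > 0).astype(np.float32)
        return ct, pet, mask
```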
To enable experiment tracking and visualization with Weights & Biases (WandB):
- Export your API key before starting training:
```bash
export WANDB_API_KEY=<your_api_key>
```

- Enable logging in any training run with:

```bash
--wandb --wandb_project vmambax --wandb_run_name <run-name>
```

All logs and metrics will be automatically synchronized to your WandB workspace.
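If you prefer setting things up from Python, the same effect can be achieved with the standard `wandb` API (this is illustrative, not this repository's training code; the key placeholder is hypothetical):

```python
import os

import wandb

# Equivalent of `export WANDB_API_KEY=...` for the current process.
os.environ.setdefault("WANDB_API_KEY", "<your_api_key>")

run = wandb.init(project="vmambax", name="context-gate")
run.log({"val/dice": 0.0})  # metrics logged during training appear under this run
run.finish()
```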
Single-GPU training:

```bash
python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --batch_size 8 \
  --epochs 50 \
  --lr 6e-5
```

Training with WandB logging:

```bash
python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --batch_size 8 \
  --epochs 50 \
  --lr 6e-5 \
  --wandb \
  --wandb_project vMambaX \
  --wandb_run_name context-gate
```

- Configure `CUDA_VISIBLE_DEVICES`, `--devices`, and `--nodes` as needed.
Single-node multi-GPU training (8 GPUs):

```bash
python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --devices 8 \
  --batch_size 4 \
  --epochs 50 \
  --lr 6e-5
```

Multi-node training (2 nodes with 8 GPUs each):

```bash
python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --devices 8 \
  --nodes 2 \
  --batch_size 4 \
  --epochs 50 \
  --lr 6e-5
```
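Note that the per-device `--batch_size` drops from 8 to 4 in the multi-GPU runs: under data-parallel training the global batch size scales with the device count. A quick arithmetic check, assuming standard DDP-style scaling:

```python
# Global batch size under data parallelism:
# per-device batch size × devices per node × nodes.
def effective_batch(batch_size: int, devices: int = 1, nodes: int = 1) -> int:
    return batch_size * devices * nodes


print(effective_batch(8))        # single GPU        -> 8
print(effective_batch(4, 8))     # 1 node × 8 GPUs   -> 32
print(effective_batch(4, 8, 2))  # 2 nodes × 8 GPUs  -> 64
```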
- Download or select a checkpoint (`.ckpt` from Lightning or `.pth` weights).
- Run:

```bash
python pred.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --checkpoint path/to/best.ckpt \
  --device cuda
```
- Metrics reported: IoU, Dice, Accuracy, and HD95. Results are written to `results/`.
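For reference, the overlap metrics follow the standard definitions on binary masks (an illustrative sketch, independent of this repository's evaluation code):

```python
import numpy as np


def dice_iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7):
    """Dice = 2|A∩B| / (|A| + |B|);  IoU = |A∩B| / |A∪B| on binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = (2 * inter + eps) / (pred.sum() + gt.sum() + eps)
    iou = (inter + eps) / (np.logical_or(pred, gt).sum() + eps)
    return float(dice), float(iou)
```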
For further information or inquiries, please contact e.muleroayllon [at] unicampus [dot] it and/or matteo.tortora [at] unige [dot] it.
If you find this code useful, please consider citing our work:
```bibtex
@misc{ayllon2025context,
  title={Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation},
  author={Elena Mulero Ayllón and Linlin Shen and Pierangelo Veltri and Fabrizia Gelardi and Arturo Chiti and Paolo Soda and Matteo Tortora},
  year={2025},
  eprint={2510.27508},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2510.27508},
}
```

This work builds on the original CIPA repository.
We also acknowledge the valuable open-source contributions of VMamba and Sigma.
