Elena Mulero Ayllón¹, Linlin Shen², Pierangelo Veltri³, Fabrizia Gelardi⁴,⁵, Arturo Chiti⁴,⁵, Paolo Soda¹,⁶, Matteo Tortora⁷

¹ University Campus Bio-Medico of Rome, ² Shenzhen University, ³ University of Calabria, ⁴ IRCCS San Raffaele Hospital, ⁵ Vita-Salute San Raffaele University, ⁶ Umeå University, ⁷ University of Genoa
This repository contains the code for our paper *Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation* [[Paper](https://arxiv.org/abs/2510.27508)].
- Create a Python 3.10 environment:

```bash
conda create -n vmambax python=3.10
conda activate vmambax
```
- Install PyTorch 2.1.2 (or any CUDA build matching your hardware):

```bash
pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
```
- Install project dependencies:

```bash
pip install -r requirements.txt
```
- Build the `selective_scan` CUDA operator:

```bash
cd src/models/encoders/selective_scan
pip install .
cd ../../../..   # back to the repository root
```
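After these steps, a quick sanity check that the expected PyTorch build is active (standard PyTorch calls only; the `selective_scan` import path is repo-specific and not shown here):

```python
import torch

print(torch.__version__)           # expect 2.1.2 (+cu118 for the build above)
print(torch.cuda.is_available())   # expect True on a CUDA-capable machine
```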
In this study, we used the PCLT20K dataset, which comprises 21,930 PET-CT image pairs with expert-annotated lung tumors collected from 605 patients.
Further information and data access are available from the official PCLT20K dataset page hosted by CIPA.
- Place or symlink the dataset under `data/PCLT20K`:

```bash
mkdir -p data
ln -s /path/to/PCLT20K data/PCLT20K
```
- Ensure `train.txt`, `test.txt`, and optionally `val.txt` are located in the dataset root:
```text
PCLT20K/
├── 0001/
│   ├── 0001_CT.png
│   ├── 0001_PET.png
│   └── 0001_mask.png
├── 0002/
│   ├── 0002_CT.png
│   ├── 0002_PET.png
│   └── 0002_mask.png
├── ...
├── train.txt
├── val.txt   # optional, otherwise split is derived automatically
└── test.txt
```
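For orientation, here is a minimal, hypothetical loader for this layout. It assumes each split file lists one case ID per line (e.g. `0001`) and follows the file naming shown above; the repository's own dataset class may differ.

```python
from pathlib import Path

import numpy as np
from PIL import Image
from torch.utils.data import Dataset


class PCLT20KPairs(Dataset):
    """Hypothetical loader sketch for the layout above (not the repo's class)."""

    def __init__(self, root: str, split_file: str = "train.txt"):
        self.root = Path(root)
        # Assumption: split files contain one case ID per line, e.g. "0001".
        with open(self.root / split_file) as f:
            self.ids = [line.strip() for line in f if line.strip()]

    def __len__(self) -> int:
        return len(self.ids)

    def __getitem__(self, i: int):
        cid = self.ids[i]
        case = self.root / cid
        # Scale images to [0, 1]; binarize the tumor mask.
        ct = np.asarray(Image.open(case / f"{cid}_CT.png"), dtype=np.float32) / 255.0
        pet = np.asarray(Image.open(case / f"{cid}_PET.png"), dtype=np.float32) / 255.0
        mask = (np.asarray(Image.open(case / f"{cid}_mask.png")) > 0).astype(np.float32)
        return ct, pet, mask
```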
To enable experiment tracking and visualization with Weights & Biases (WandB):
- Export your API key before starting training:
```bash
export WANDB_API_KEY=<your_api_key>
```

- Enable logging in any training run with:

```bash
--wandb --wandb_project vmambax --wandb_run_name <run-name>
```

All logs and metrics will be automatically synchronized to your WandB workspace.
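If you prefer setting things up from Python, the same effect can be achieved with the standard `wandb` API (this is illustrative, not this repository's training code; the key placeholder is hypothetical):

```python
import os

import wandb

# Equivalent of `export WANDB_API_KEY=...` for the current process.
os.environ.setdefault("WANDB_API_KEY", "<your_api_key>")

run = wandb.init(project="vmambax", name="context-gate")
run.log({"val/dice": 0.0})  # metrics logged during training appear under this run
run.finish()
```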
Single-GPU training:

```bash
python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --batch_size 8 \
  --epochs 50 \
  --lr 6e-5
```

Training with WandB logging:

```bash
python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --batch_size 8 \
  --epochs 50 \
  --lr 6e-5 \
  --wandb \
  --wandb_project vMambaX \
  --wandb_run_name context-gate
```

- Configure `CUDA_VISIBLE_DEVICES`, `--devices`, and `--nodes` as needed.
Single-node multi-GPU training (8 GPUs):

```bash
python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --devices 8 \
  --batch_size 4 \
  --epochs 50 \
  --lr 6e-5
```

Multi-node training (2 nodes with 8 GPUs each):

```bash
python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --devices 8 \
  --nodes 2 \
  --batch_size 4 \
  --epochs 50 \
  --lr 6e-5
```
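Note that the per-device `--batch_size` drops from 8 to 4 in the multi-GPU runs: under data-parallel training the global batch size scales with the device count. A quick arithmetic check, assuming standard DDP-style scaling:

```python
# Global batch size under data parallelism:
# per-device batch size × devices per node × nodes.
def effective_batch(batch_size: int, devices: int = 1, nodes: int = 1) -> int:
    return batch_size * devices * nodes


print(effective_batch(8))        # single GPU        -> 8
print(effective_batch(4, 8))     # 1 node × 8 GPUs   -> 32
print(effective_batch(4, 8, 2))  # 2 nodes × 8 GPUs  -> 64
```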
- Download or select a checkpoint (`.ckpt` from Lightning or `.pth` weights).
- Run:

```bash
python pred.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --checkpoint path/to/best.ckpt \
  --device cuda
```
- Metrics reported: IoU, Dice, Accuracy, and HD95. Results are written to `results/`.
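For reference, the overlap metrics follow the standard definitions on binary masks (an illustrative sketch, independent of this repository's evaluation code):

```python
import numpy as np


def dice_iou(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-7):
    """Dice = 2|A∩B| / (|A| + |B|);  IoU = |A∩B| / |A∪B| on binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    dice = (2 * inter + eps) / (pred.sum() + gt.sum() + eps)
    iou = (inter + eps) / (np.logical_or(pred, gt).sum() + eps)
    return float(dice), float(iou)
```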
For further information or inquiries, please contact e.muleroayllon [at] unicampus [dot] it and/or matteo.tortora [at] unige [dot] it.
If you find this code useful, please consider citing our work:
```bibtex
@misc{ayllon2025context,
  title={Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation},
  author={Elena Mulero Ayllón and Linlin Shen and Pierangelo Veltri and Fabrizia Gelardi and Arturo Chiti and Paolo Soda and Matteo Tortora},
  year={2025},
  eprint={2510.27508},
  archivePrefix={arXiv},
  primaryClass={cs.CV},
  url={https://arxiv.org/abs/2510.27508},
}
```

This work builds on the original CIPA repository.
We also acknowledge the valuable open-source contributions of VMamba and Sigma.
