vMambaX: Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation

Elena Mulero Ayllón¹, Linlin Shen², Pierangelo Veltri³, Fabrizia Gelardi⁴,⁵, Arturo Chiti⁴,⁵, Paolo Soda¹,⁶, Matteo Tortora⁷

¹ University Campus Bio-Medico of Rome, ² Shenzhen University, ³ University of Calabria, ⁴ IRCCS San Raffaele Hospital, ⁵ Vita-Salute San Raffaele University, ⁶ Umeå University, ⁷ University of Genoa

arXiv License: MIT

Overview

This repository contains the code for our paper Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation [Paper].

Environment Setup

  1. Create a Python 3.10 environment.
    conda create -n vmambax python=3.10
    conda activate vmambax
  2. Install PyTorch 2.1.2 (or any CUDA build matching your hardware).
    pip install torch==2.1.2 torchvision==0.16.2 torchaudio==2.1.2 --index-url https://download.pytorch.org/whl/cu118
  3. Install project dependencies.
    pip install -r requirements.txt
  4. Build the selective_scan CUDA operator.
    cd src/models/encoders/selective_scan
    pip install .
    cd ../../../..
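
After these steps, you can sanity-check the environment with the short Python snippet below. This is an optional sketch (the file name check_env.py is illustrative); it only verifies the PyTorch install and CUDA visibility, not the compiled selective_scan operator.

    # check_env.py -- optional sanity check of the PyTorch install
    import torch

    print("PyTorch version:", torch.__version__)
    print("CUDA available:", torch.cuda.is_available())
    if torch.cuda.is_available():
        print("GPU:", torch.cuda.get_device_name(0))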

Setup

Dataset

In this study, we used the PCLT20K dataset, which comprises 21,930 PET-CT image pairs with expert-annotated lung tumors collected from 605 patients.

Further information and data access are available from the official PCLT20K dataset page hosted by CIPA.

Preparing the data

  1. Place or symlink the dataset under data/PCLT20K.
    mkdir -p data
    ln -s /path/to/PCLT20K data/PCLT20K
  2. Ensure train.txt, test.txt, and optionally val.txt are located in the dataset root.

Required directory structure

PCLT20K/
├── 0001/
│   ├── 0001_CT.png
│   ├── 0001_PET.png
│   └── 0001_mask.png
├── 0002/
│   ├── 0002_CT.png
│   ├── 0002_PET.png
│   └── 0002_mask.png
├── ...
├── train.txt
├── val.txt        # optional, otherwise split is derived automatically
└── test.txt
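
Before training, it can help to confirm that every case folder follows the naming convention shown above. The script below is a minimal sketch based only on that layout (the file name check_dataset.py is illustrative and not part of the repository).

    # check_dataset.py -- minimal sketch: verify the PCLT20K layout shown above
    import os
    import sys

    root = sys.argv[1] if len(sys.argv) > 1 else "data/PCLT20K"
    missing = []

    for case in sorted(os.listdir(root)):
        case_dir = os.path.join(root, case)
        if not os.path.isdir(case_dir):
            continue  # skip train.txt / val.txt / test.txt
        for suffix in ("CT", "PET", "mask"):
            expected = os.path.join(case_dir, f"{case}_{suffix}.png")
            if not os.path.isfile(expected):
                missing.append(expected)

    print(f"Checked {root}: {len(missing)} missing file(s)")
    for path in missing[:20]:
        print("  missing:", path)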

Pretrained weights

Weights & Biases (WandB) Logging

To enable experiment tracking and visualization with Weights & Biases (WandB):

  1. Export your API key before starting training:
    export WANDB_API_KEY=<your_api_key>
  2. Enable logging in any training run using:
    --wandb --wandb_project vmambax --wandb_run_name <run-name>

All logs and metrics will be automatically synchronized to your WandB workspace.

Training

Single-GPU example

python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --batch_size 8 \
  --epochs 50 \
  --lr 6e-5

Single-GPU example with WandB Logging

python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --batch_size 8 \
  --epochs 50 \
  --lr 6e-5 \
  --wandb \
  --wandb_project vMambaX \
  --wandb_run_name context-gate

Multi-GPU examples

  • Configure CUDA_VISIBLE_DEVICES, --devices, and --nodes as needed.

Single Node

python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --devices 8 \
  --batch_size 4 \
  --epochs 50 \
  --lr 6e-5

Multi Node

python train.py \
  --img_dir data/PCLT20K \
  --split_train_val_test data/PCLT20K \
  --devices 8 \
  --nodes 2 \
  --batch_size 4 \
  --epochs 50 \
  --lr 6e-5

Inference and evaluation

  1. Download or select a checkpoint (.ckpt from Lightning or .pth weights).
  2. Run:
    python pred.py \
      --img_dir data/PCLT20K \
      --split_train_val_test data/PCLT20K \
      --checkpoint path/to/best.ckpt \
      --device cuda
  3. Metrics reported: IoU, Dice, Accuracy, and HD95. Results are written to results/.
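
For reference, IoU and Dice on binary masks reduce to simple overlap ratios. The snippet below is an illustrative sketch only, not the repository's evaluation code (HD95 additionally requires boundary distances, available e.g. via MedPy or scipy).

    # metrics_sketch.py -- reference-only IoU and Dice for binary masks
    import numpy as np

    def iou_and_dice(pred, target, eps=1e-7):
        """pred and target: binary arrays of the same shape."""
        pred = pred.astype(bool)
        target = target.astype(bool)
        inter = np.logical_and(pred, target).sum()
        union = np.logical_or(pred, target).sum()
        iou = (inter + eps) / (union + eps)
        dice = (2 * inter + eps) / (pred.sum() + target.sum() + eps)
        return iou, dice

    # Tiny example: two masks overlapping in 2 of their 3 foreground pixels each
    pred = np.zeros((4, 4), dtype=np.uint8);   pred[0, :3] = 1
    target = np.zeros((4, 4), dtype=np.uint8); target[0, 1:4] = 1
    print(iou_and_dice(pred, target))  # IoU = 0.5, Dice ~ 0.667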

Contact

For further information or inquiries, please contact e.muleroayllon [at] unicampus [dot] it and/or matteo.tortora [at] unige [dot] it.

BibTeX & Citation

If you find this code useful, please consider citing our work:

@misc{ayllon2025context,
      title={Context-Gated Cross-Modal Perception with Visual Mamba for PET-CT Lung Tumor Segmentation}, 
      author={Elena Mulero Ayllón and Linlin Shen and Pierangelo Veltri and Fabrizia Gelardi and Arturo Chiti and Paolo Soda and Matteo Tortora},
      year={2025},
      eprint={2510.27508},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.27508}, 
}

Acknowledgements

This work builds on the original CIPA repository.
We also acknowledge the valuable open-source contributions of VMamba and Sigma.
