
very slow start time when parallelizing #167

@Remi-Gau

Description


Testing it on our large test nodes, the commands seem to work quite well for a single subject. I would like to parallelize them to process my entire study; participants each have around 30 sessions. Attempting to parallelize per subject on our GPU cluster fails: the jobs keep getting killed for running out of memory. In fact, bidsmreye seems to take an extremely long time just to start, on the order of several hours before the job begins doing any work.

Here is the SLURM script I'm using:

#!/bin/bash -l

#SBATCH --job-name=[bidsmreye]
#SBATCH -o log/bidsmreye_%a.txt
#SBATCH -e log/bidsmreye_%a.err
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --mem-per-cpu=8G
#SBATCH --account=DBIC
#SBATCH --partition=gpuq
#SBATCH --gres=gpu:2
#SBATCH --time=7-01:00:00
#SBATCH --mail-type=FAIL,END
#SBATCH --requeue
#SBATCH --array=0-11

# Output and error log directories
output_log_dir="log"
error_log_dir="log"

# Create the directories if they don't exist
mkdir -p "$output_log_dir"
mkdir -p "$error_log_dir"

# Must run on a GPU node
module load cuda
module load TensorRT
nvidia-smi
echo $CUDA_VISIBLE_DEVICES
hostname

# bidsmreye requires the input fMRI data (fMRIPrep outputs) to be at least realigned,
# with filenames and structure that conform to a BIDS derivatives dataset

# Had to add these lines to initialize conda
conda init bash
source ~/.bashrc
conda activate deepmreye

# Check if SLURM_ARRAY_TASK_ID is not set or is empty
if [ -z "$SLURM_ARRAY_TASK_ID" ]; then
    # Set SLURM_ARRAY_TASK_ID to a default value, e.g., 1
    SLURM_ARRAY_TASK_ID=0
fi

bids_dir="/dartfs-hpc/rc/lab/C/CANlab/labdata/data/WASABI/derivatives/fmriprep-try2"
output_dir="/dartfs-hpc/rc/lab/C/CANlab/labdata/data/WASABI/derivatives/deepmreye"
SUBJECTS=(SID000002 SID000743 SID001567 SID001651 SID001804 SID001907 SID001641 SID001684 SID001852 SID002035 SID002263 SID002328)
SUBJ=${SUBJECTS[$SLURM_ARRAY_TASK_ID]}
echo "processing bidsmreye for ${SUBJ}..."

# Preparing the data, then computing the eye movements (action prepare; action generalize)
# Prepare: registers the data to MNI if it is not already, registers the data to the deepmreye template, extracts data from the eyes mask
bidsmreye --action all \
    ${bids_dir} \
    ${output_dir} \
    participant --participant_label ${SUBJ} 
    
# Group Level Summary
bidsmreye --action qc \
    ${bids_dir} \
    ${output_dir} \
    participant --participant_label ${SUBJ} 

echo "processing complete"
