Datasets, code, and pretrained weights for “Masked Image Modeling for Generalizable Organelle Segmentation in Volume EM” (under review)
OrgMIM is a masked image modeling framework for organelle-specific representation learning from volumetric EM data. It replaces random masking with complementary strategies based on structural priors and reconstruction feedback. We further introduce IsoOrg-1K, an organelle-centric 3D EM dataset with 928 volumes (>120B voxels) for large-scale pretraining.
- 1. Pretraining Database: IsoOrg-1K
- 2. Downstream Segmentation Datasets
- 3. Environments
- 4. Organelle-specific Pretraining via OrgMIM
- 5. Downstream Finetuning
- 6. Visualization
- 7. Released Weights
- 8. Acknowledgements
We introduce IsoOrg-1K, a diverse organelle-specific dataset collected from OpenOrganelle. Detailed information is shown below. The full dataset (and the metadata) can be accessed here, and the precomputed membrane maps are available here.
Meanwhile, we are actively curating and integrating organelle datasets, and will continue to update this repository to support larger-scale pretraining in the future.

We conduct extensive experiments on six representative datasets with varying voxel resolutions and biological contexts. The processed and partitioned data can be downloaded from here.
The complete Conda environment has been packaged for direct use; you can download it here and unzip it.
The formalized description can be seen in 'preparation/MAM_details.png'.
First, install the Segment Anything package:
```bash
pip install git+https://github.com/facebookresearch/segment-anything.git
```
Then, load the SAM model and weights in Python:
```python
from segment_anything import sam_model_registry, SamPredictor

# Available model types: "vit_h", "vit_l", "vit_b"
model_type = "vit_h"

# Download the checkpoint from the official GitHub:
# https://github.com/facebookresearch/segment-anything#model-checkpoints
checkpoint_path = "sam_vit_h_4b8939.pth"

sam = sam_model_registry[model_type](checkpoint=checkpoint_path)
```
In addition to SAM, models from the DINO family can also provide relevant priors. However, according to our qualitative experiments (see Figures/pca.png), their performance on EM data is not yet on par with that of SAM.
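As a reference for that comparison, a minimal sketch of extracting DINOv2 patch features and projecting them with PCA is shown below; the `torch.hub` entry point, the placeholder input, and the PCA projection are our assumptions and are not part of this repository.

```python
# Illustrative only (an assumption, not part of this repository): extract DINOv2
# patch features from an EM slice and project them with PCA, as in Figures/pca.png.
import numpy as np
import torch
from sklearn.decomposition import PCA

dinov2 = torch.hub.load("facebookresearch/dinov2", "dinov2_vitb14")
dinov2.eval()

# Placeholder input: an EM slice replicated to 3 channels, scaled to [0, 1],
# with H and W divisible by the ViT patch size (14).
img_rgb = np.random.rand(1, 3, 518, 518).astype(np.float32)
x = torch.from_numpy(img_rgb)

with torch.no_grad():
    tokens = dinov2.forward_features(x)["x_norm_patchtokens"]  # (1, N_patches, C)

# PCA-project the patch tokens to 3 components and reshape them into a coarse feature map
pca = PCA(n_components=3).fit_transform(tokens.squeeze(0).numpy())
pca_map = pca.reshape(518 // 14, 518 // 14, 3)
```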
The MAM is then computed from the SAM features as follows:

```python
import numpy as np
import tifffile

# Helper functions defined in preparation/mam_utils.py
from preparation.mam_utils import embeddings_to_affinities, nearest_neighbor_resize

# Load a single-channel TIFF volume (placeholder path), pick one slice, and convert it to 3-channel RGB
tiff = tifffile.imread('/path/to/volume.tif')
img = tiff[i, :, :]                          # i: slice index
img_rgb = np.stack([img] * 3, axis=0)        # Shape: (3, H, W)
image = np.transpose(img_rgb, (1, 2, 0))     # Shape: (H, W, 3)

# Initialize the SAM predictor (using `sam` loaded above) and extract features
predictor = SamPredictor(sam)
predictor.set_image(image)
embedding = predictor.features
embedding = embedding.detach().cpu().numpy().squeeze()  # Shape: (C, H, W)

# Compute pixel affinities from the embeddings
affs = embeddings_to_affinities(embedding, delta_v=0.5, delta_d=1.5)
mam = np.minimum(affs[0], affs[1])  # Element-wise min of the first two affinity channels
mam = mam[1:, 1:]

# Resize to the desired shape
mam_resized = nearest_neighbor_resize(mam, (512, 512))

# Convert to uint8 for saving or visualization
mam_uint8 = np.uint8(255 * mam_resized)
```

| Function / Class | Defined In | Description |
|---|---|---|
| `embeddings_to_affinities` | `preparation/mam_utils.py` | Converts pixel embeddings into affinity maps |
| `nearest_neighbor_resize` | `preparation/mam_utils.py` | Resizes 2D arrays using nearest-neighbor interpolation |
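Putting the steps above together, a minimal sketch for precomputing the MAM of an entire volume might look as follows; the file paths, the slice loop, and the use of `tifffile` for I/O are assumptions rather than a fixed interface of this repository.

```python
# Minimal sketch (assumed I/O via tifffile; paths are placeholders): compute the
# MAM slice by slice for one volume and save it as a uint8 TIFF stack.
import numpy as np
import tifffile
from segment_anything import SamPredictor
from preparation.mam_utils import embeddings_to_affinities, nearest_neighbor_resize

def compute_mam_volume(sam, tiff_path, out_path, out_size=(512, 512)):
    volume = tifffile.imread(tiff_path)  # (D, H, W), assumed uint8, single channel
    predictor = SamPredictor(sam)
    mam_slices = []
    for i in range(volume.shape[0]):
        image = np.stack([volume[i]] * 3, axis=-1)  # (H, W, 3) RGB input for SAM
        predictor.set_image(image)
        emb = predictor.features.detach().cpu().numpy().squeeze()
        affs = embeddings_to_affinities(emb, delta_v=0.5, delta_d=1.5)
        mam = np.minimum(affs[0], affs[1])[1:, 1:]
        mam_slices.append(np.uint8(255 * nearest_neighbor_resize(mam, out_size)))
    tifffile.imwrite(out_path, np.stack(mam_slices, axis=0))

# Example usage (placeholder paths):
# compute_mam_volume(sam, '/path/to/volume.tif', '/path/to/mam.tif')
```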
After downloading and preparing the pretraining dataset, OrgMIM pretraining can be launched using the following command:
```bash
python scripts/pretrain.py -c orgmim
```
All major experimental settings are specified in a unified configuration file (`scripts/config/orgmim.yaml`), including:
- Backbone architecture: ViT or CNN
- Model scale: small / base / large
- Training hyperparameters: masking ratio, etc.
Processed downstream datasets are available here. Notably, the input data are normalized by dividing pixel intensities by 255.0. The current implementation supports automatic downloading and loading of pretrained OrgMIM weights with different backbone architectures and model scales through a unified configuration file.
```bash
python scripts/finetune.py -c orgmim
```
We note that this repository does not provide task-specific training pipelines; it focuses on releasing pretrained weights together with example code for network initialization.
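For reference, a minimal sketch of inspecting a released checkpoint before mapping its encoder weights onto a downstream backbone is shown below; the checkpoint path and the `state_dict` key layout are assumptions, and the finetuning script above handles this automatically through its configuration file.

```python
# Minimal sketch (path and key layout are assumptions): inspect a released
# OrgMIM checkpoint before initializing a downstream encoder from it.
import torch

ckpt = torch.load("orgmim_mae_b_learner.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # handle plain or Lightning-style checkpoints

# List a few parameter names and shapes to see how they map onto your encoder
for name in list(state_dict)[:10]:
    print(name, tuple(state_dict[name].shape))

# Typical usage: filter the encoder weights and load them non-strictly, e.g.
# your_encoder.load_state_dict(filtered_state_dict, strict=False)
```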
```python
import tifffile
import torch

# reconstruct_and_visualize is defined in legacy/orgmim_mae/visualize.py
from legacy.orgmim_mae.visualize import reconstruct_and_visualize

ckpt_path_list = ['/***/***/orgmim_mae_b_learner.ckpt']
img_path = '/opt/data/.../input/image.tif'
att_path = '/opt/data/.../input/mam.tif'
save_dir = '/opt/data/.../output'
name_list = ['dual']

# The raw EM image and its precomputed MAM are loaded from the paths above
# (loading via tifffile is an assumption); `learner` is the OrgMIM MAE learner,
# whose construction follows the pretraining setup and is omitted here.
img = tifffile.imread(img_path)
mam = tifffile.imread(att_path)
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

reconstruct_and_visualize(
    learner=learner,
    ckpt_paths=ckpt_path_list,
    img=img,
    att=mam,
    device=device,
    save_dir=save_dir,
    name_list=name_list,
    mask_ratio=0.75,
    step=200000,
    total_step=400000,
    patch_size=16,
    image_size=128,
    alpha_t=1,
)
```

| Function / Class | Defined In | Description |
|---|---|---|
| `reconstruct_and_visualize` | `legacy/orgmim_mae/visualize.py` | Loads pretrained weights and reconstructs the masked input |
| Methods | Models | Download |
|---|---|---|
| MAE-based OrgMIM (Base) | orgmim_mae_b_learner.ckpt | Hugging Face |
| Spark-based OrgMIM (Base) | orgmim_spark_b_learner.ckpt | Hugging Face |
| MAE-based OrgMIM (Large) | orgmim_mae_l_learner.ckpt | Hugging Face |
| Spark-based OrgMIM (Large) | orgmim_spark_l_learner.ckpt | Hugging Face |
| MAE-based OrgMIM (Small) | orgmim_mae_s_learner.ckpt | Hugging Face |
| Spark-based OrgMIM (Small) | orgmim_spark_s_learner.ckpt | Hugging Face |
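If you prefer to fetch the checkpoints programmatically, a hedged sketch using `huggingface_hub` is given below; the repository ID is a placeholder, and the authoritative links are those in the table above.

```python
# Minimal sketch: download a released checkpoint with huggingface_hub.
# The repo_id is a placeholder; refer to the Hugging Face links in the table above.
from huggingface_hub import hf_hub_download

ckpt_path = hf_hub_download(
    repo_id="<org>/<orgmim-weights>",        # placeholder repository ID
    filename="orgmim_mae_b_learner.ckpt",    # any checkpoint listed in the table
)
print(ckpt_path)
```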
We sincerely thank all contributors and the providers of the open-source datasets that supported this project, including OpenOrganelle.
If you have any questions or suggestions, feel free to contact us via email or by opening an issue.