
Feature Request: Structured multi-assay support (counts, data, scale): beyond SCT, as a general object model #102

@BenjaminDEMAILLE

🚀 Feature Request: Structured Multi-Assay Support in AnnData

Summary

This issue proposes native AnnData support for a structured multi-assay architecture in which each assay (e.g. RNA, ADT, ATAC) contains standard substructures such as counts, data, and scale, similar to Seurat and SingleCellExperiment (SCE). This would extend AnnData's flexibility and expressiveness for diverse workflows, including normalization pipelines, multimodal data, and modular preprocessing strategies.


📌 Motivation

In Seurat and SCE:

  • A single object supports multiple assays.
  • Each assay has structured slots (counts, data, scale.data, etc.).
  • Switching between assays and their layers is clean and reproducible.

In AnnData today:

  • Only one .X is supported.
  • Additional views (e.g., normalized, scaled) are stored ad hoc in .layers.
  • Multi-modal data or multi-normalization workflows require naming hacks:
  adata.layers["rna_counts"]
  adata.layers["sct_data"]
  adata.obsm["adt"]

There is no grouping of layers, metadata, or parameters by assay.

This creates fragile, non-standard, and hard-to-read pipelines.

💡 Proposal

🔹 Add a formal .assays slot

Allow structured access to multiple assays within a single AnnData object:

adata.assays["RNA"]
adata.assays["SCT"]
adata.assays["ADT"]

🔹 Each Assay would include:

adata.assays["RNA"].layers["counts"]   # raw counts
adata.assays["RNA"].layers["data"]     # normalized
adata.assays["RNA"].layers["scale"]    # scaled

adata.assays["RNA"].X                  # Optional: shortcut for default layer (e.g., 'data')
adata.assays["RNA"].var                # Assay-specific gene metadata
adata.assays["RNA"].uns                # Assay-specific parameters/config
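One way to picture the per-assay structure above is a small container class. The following is a minimal sketch in plain Python, not AnnData code: the `Assay` class, its `default_layer` field, and the nested lists standing in for real matrices are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

Matrix = List[List[float]]  # stand-in for a cells x features matrix


@dataclass
class Assay:
    """Hypothetical per-assay container; not part of the AnnData API."""

    layers: Dict[str, Matrix] = field(default_factory=dict)
    var: Dict[str, List[Any]] = field(default_factory=dict)  # feature-level metadata
    uns: Dict[str, Any] = field(default_factory=dict)        # assay-specific parameters
    default_layer: Optional[str] = None

    @property
    def X(self) -> Matrix:
        """Shortcut to the default layer, mirroring the proposed `.X`."""
        if self.default_layer is None:
            raise ValueError("no default layer set for this assay")
        return self.layers[self.default_layer]


# Illustrative values only: two cells, two genes.
rna = Assay(
    layers={
        "counts": [[3, 0], [1, 2]],            # raw counts
        "data": [[1.39, 0.0], [0.69, 1.10]],   # normalized
    },
    var={"gene_name": ["GeneA", "GeneB"]},
    uns={"normalization": "log1p"},
    default_layer="data",
)
assays = {"RNA": rna}
```

With this layout, `assays["RNA"].X` resolves to the `data` layer, while the raw counts stay available under `assays["RNA"].layers["counts"]`.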

🔹 Global utility functions

adata.set_current_assay("RNA")
adata.get_active_layer()  # Returns counts / data / scale based on context
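The "active assay" behaviour could look roughly like the sketch below. It is written in plain Python and is only an assumption about how such utilities might behave: `MultiAssayData` is a hypothetical class, and the optional `layer` argument on `set_current_assay` is not part of the proposal above.

```python
from typing import Dict, List

Matrix = List[List[float]]  # stand-in for a cells x features matrix


class MultiAssayData:
    """Hypothetical container that tracks an active assay and layer."""

    def __init__(
        self,
        assays: Dict[str, Dict[str, Matrix]],
        current_assay: str,
        current_layer: str = "data",
    ) -> None:
        # assays maps assay name -> {layer name -> matrix}
        self.assays = assays
        self.set_current_assay(current_assay, current_layer)

    def set_current_assay(self, name: str, layer: str = "data") -> None:
        """Select the assay (and layer) that later calls operate on."""
        if name not in self.assays:
            raise KeyError(f"unknown assay: {name!r}")
        if layer not in self.assays[name]:
            raise KeyError(f"assay {name!r} has no layer {layer!r}")
        self._current_assay = name
        self._current_layer = layer

    def get_active_layer(self) -> Matrix:
        """Return the matrix for the currently selected assay and layer."""
        return self.assays[self._current_assay][self._current_layer]


mdata = MultiAssayData(
    {
        "RNA": {"counts": [[3, 0]], "data": [[1.39, 0.0]]},
        "ADT": {"counts": [[12, 7]], "data": [[0.5, -0.5]]},
    },
    current_assay="RNA",
)
mdata.set_current_assay("ADT", layer="counts")
active = mdata.get_active_layer()
```

Validating the assay and layer names at selection time, rather than at access time, means a typo fails immediately instead of surfacing later in a pipeline.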

✅ Benefits

  • Clean, standardized support for multiple assays and multiple representations.
  • Better alignment with Seurat and SCE, facilitating interoperability.
  • Enables complex workflows such as:
      • SCTransform (Pearson residuals + raw counts)
      • CLR / log-normalization comparisons
      • RNA + ADT + ATAC integration
      • Denoising / imputation benchmarking
  • Easier switching between data views.
  • Clearer pipelines, less risk of user error.

🔄 Compatibility

  • Backwards-compatible: .X can point to the default layer of a default assay.
  • Could gracefully promote .layers into structured sub-objects, preserving existing behavior while offering more structure.
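As a rough illustration of the promotion idea, the sketch below groups flat layer keys into per-assay dictionaries and resolves a legacy `.X` from the default assay. It is plain Python with assumptions throughout: `promote_layers` and `legacy_X` are hypothetical helpers, and the `<assay>_<layer>` key convention is only inferred from names like `rna_counts` and `sct_data` used earlier in this issue.

```python
from typing import Dict, List

Matrix = List[List[float]]  # stand-in for a cells x features matrix


def promote_layers(
    flat_layers: Dict[str, Matrix], default_assay: str = "RNA"
) -> Dict[str, Dict[str, Matrix]]:
    """Group flat keys such as 'rna_counts' into {assay: {layer: matrix}}.

    Keys that do not follow the assumed '<assay>_<layer>' pattern are kept
    under `default_assay`, so existing un-prefixed layers stay reachable.
    """
    assays: Dict[str, Dict[str, Matrix]] = {}
    for key, matrix in flat_layers.items():
        prefix, sep, layer = key.partition("_")
        if sep and prefix and layer:
            assay = prefix.upper()
        else:
            assay, layer = default_assay, key
        assays.setdefault(assay, {})[layer] = matrix
    return assays


def legacy_X(
    assays: Dict[str, Dict[str, Matrix]],
    default_assay: str = "RNA",
    default_layer: str = "data",
) -> Matrix:
    """Resolve what a backwards-compatible `.X` could point to."""
    return assays[default_assay][default_layer]


flat = {
    "rna_counts": [[3, 0]],
    "rna_data": [[1.39, 0.0]],
    "sct_data": [[0.8, -0.2]],
    "spliced": [[2, 0]],  # un-prefixed key, falls back to the default assay
}
assays = promote_layers(flat)
x = legacy_X(assays)
```

A real implementation would need an explicit mapping rather than string parsing, since existing layer names may contain underscores that have nothing to do with assays.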

πŸ” Related Projects

| Framework | Multi-Assay Support | Layer Structuring | Notes |
|---|---|---|---|
| Seurat (R) | ✅ | ✅ (counts, data, scale.data) | Well-defined Assay class |
| SCE (R) | ✅ | ✅ | Widely used in Bioconductor |
| MuData (Python) | ✅ | ❌ (overkill for mono-assay) | Designed for multi-modal omics |
| AnnData (current) | ❌ | ❌ | Flat structure with ungrouped layers |

❓ Open Questions

  • Would you consider integrating this natively into AnnData?
  • Should this be part of AnnData core, or exist as a well-supported extension?
  • Would .assays be required for future workflows, or remain optional?
  • Would you be open to a community-driven prototype or API proposal?

πŸ™ Thanks

Thank you for your hard work maintaining this essential tool.
AnnData is already an amazing foundation, and this enhancement would further align it with the evolving needs of single-cell analysis workflows across modalities, species, and platforms.

Happy to contribute or help prototype.

Best regards,
Benjamin

Labels: enhancement
