
Conversation

@hongquanli (Collaborator) commented Jan 25, 2026

Summary

  • Add support for reading zarr v3 datasets (OME-NGFF format) using tensorstore as the I/O backend
  • Enables ndviewer_light to open microscopy acquisitions saved in zarr v3 format with proper physical metadata, channel colors, and live acquisition support
  • Thread-safe push-based API for live acquisition viewing

Features

Tensorstore Backend

  • Uses tensorstore instead of zarr-python for unified zarr v2/v3 support
  • Auto-detects zarr version and uses appropriate driver
  • Lazy loading via dask for memory efficiency

File-based Loading

  • Format detection: Automatically detects zarr v3 datasets by checking for:

    • plate.zarr/ or plate.ome.zarr/ directory (HCS plate format)
    • zarr/ directory with .zarr or .ome.zarr subdirectories
    • Direct .zarr directories with zarr.json
  • Metadata parsing: Extracts from zarr.json:

    • ome.multiscales[0].axes and coordinateTransformations (physical pixel sizes)
    • ome.omero.channels (channel names and hex colors)
    • _squid (Squid-specific fields: pixel_size_um, z_step_um, acquisition_complete)
  • FOV discovery: Supports multiple zarr structures:

    • HCS plate: plate.ome.zarr/{row}/{col}/{fov}/0
    • Per-FOV: zarr/region/fov_N.zarr or fov_N.ome.zarr
    • 6D single: zarr/region/acquisition.zarr with shape (FOV, T, C, Z, Y, X)
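
A minimal sketch of the zarr.json parsing described above. The field names follow the bullet list; the function name here is hypothetical, and the real parse_zarr_v3_metadata also handles the older .zattrs layout and more fallbacks:

```python
import json
from pathlib import Path


def read_ngff_attrs(store: Path) -> dict:
    """Hypothetical reader, not the packaged parse_zarr_v3_metadata."""
    attrs = json.loads((store / "zarr.json").read_text()).get("attributes", {})
    ome = attrs.get("ome", attrs)  # new Squid layout nests OME keys under "ome"
    meta = {}

    multiscales = ome.get("multiscales", [])
    if multiscales:
        ms = multiscales[0]
        meta["axes"] = [ax.get("name") for ax in ms.get("axes", [])]
        # Physical pixel sizes come from the "scale" coordinate transformation.
        transforms = ms["datasets"][0].get("coordinateTransformations", [])
        meta["scale"] = next(
            (t["scale"] for t in transforms if t.get("type") == "scale"), None
        )

    channels = ome.get("omero", {}).get("channels", [])
    meta["channel_names"] = [ch.get("label") for ch in channels]
    meta["channel_colors"] = [ch.get("color") for ch in channels]  # hex strings

    # Squid-specific block: "_squid" (new key) or "_squid_metadata" (old key)
    squid = attrs.get("_squid", attrs.get("_squid_metadata", {}))
    meta["pixel_size_um"] = squid.get("pixel_size_um")
    meta["acquisition_complete"] = squid.get("acquisition_complete", False)
    return meta
```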

Push-based Zarr API (Live Acquisition)

  • start_zarr_acquisition() - for per_fov/hcs structures
  • start_zarr_acquisition_6d() - for multi-region 6D structures with variable FOVs per region
  • notify_zarr_frame(t, fov_idx, z, channel, region_idx) - thread-safe notification
  • end_zarr_acquisition()
  • Race condition prevention with was_written_before_read pattern
  • Cache invalidation on frame write
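
The intended call sequence looks roughly like this. LightweightViewer requires a Qt event loop, so a stand-in object is used here; the method names and parameters come from the list above, but the exact signatures are assumptions:

```python
class ViewerStub:
    """Stand-in for LightweightViewer, which needs Qt to instantiate."""

    def start_zarr_acquisition(self, fov_paths=None, fov_labels=None):
        self.active = True
        self.notified = []

    def notify_zarr_frame(self, t, fov_idx, z, channel, region_idx=0):
        # In the real viewer this is thread-safe: it records the plane
        # under a lock and invalidates any cached zeros for that plane.
        self.notified.append((t, fov_idx, z, channel, region_idx))

    def end_zarr_acquisition(self):
        self.active = False


viewer = ViewerStub()
viewer.start_zarr_acquisition(fov_paths=["fov_0.ome.zarr", "fov_1.ome.zarr"])
for t in range(2):
    for fov in range(2):
        for z in range(3):
            for c in range(2):
                # Writer thread: write the plane to zarr first, then notify.
                viewer.notify_zarr_frame(t=t, fov_idx=fov, z=z, channel=c)
viewer.end_zarr_acquisition()
```

The key ordering constraint is write-then-notify: the notification is what tells the viewer a plane is safe to read and cache.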

Thread Safety

  • _zarr_written_planes_lock protects the written planes set
  • _zarr_stores_lock protects lazy-loaded store dicts
  • All locks tested for concurrent access from dask workers
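
The locking pattern amounts to the following sketch. The names mirror the attributes above, but here they are module-level for illustration; in the viewer they live on the widget instance:

```python
import threading

_zarr_written_planes = set()
_zarr_written_planes_lock = threading.Lock()


def mark_plane_written(key):
    """Writer side: called by the acquisition thread via notify_zarr_frame()."""
    with _zarr_written_planes_lock:
        _zarr_written_planes.add(key)


def plane_was_written(key) -> bool:
    """Reader side: called from dask worker threads. The lock makes the
    membership check safe against a concurrent add() from the writer."""
    with _zarr_written_planes_lock:
        return key in _zarr_written_planes


# Hammer the set from several threads to mimic concurrent dask workers.
threads = [
    threading.Thread(target=mark_plane_written, args=((t, 0, 0, 0),))
    for t in range(8)
]
for th in threads:
    th.start()
for th in threads:
    th.join()
```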

New Exports

  • open_zarr_tensorstore()
  • parse_zarr_v3_metadata()
  • discover_zarr_v3_fovs()
  • hex_to_colormap()
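
As an illustration of what the color helper does, converting an OME hex color into a display LUT is roughly the following. This is a sketch with a hypothetical name; the packaged hex_to_colormap may build a different colormap object:

```python
import numpy as np


def hex_to_lut(hex_color: str, n: int = 256) -> np.ndarray:
    """Map a hex color like "00FF00" to a black-to-color ramp."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    ramp = np.linspace(0.0, 1.0, n)[:, None]  # (n, 1) intensity ramp
    return ramp * (np.array([r, g, b], dtype=float) / 255.0)  # (n, 3) RGB


lut = hex_to_lut("00FF00")  # pure green channel
```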

Test plan

  • Unit tests: pytest tests/test_zarr_v3.py tests/test_zarr_push_api.py -v (65 tests)
  • Full test suite: pytest tests/ -v (211 tests)
  • Code formatting: black --check .
  • Thread safety review and fixes
  • Integration test: Open a zarr v3 dataset saved by Squid

Testing the Push API

Use the simulation script to test the push-based zarr API:

# Test with per-FOV structure
python simulate_zarr_acquisition.py --structure per_fov

# Test with HCS plate structure
python simulate_zarr_acquisition.py --structure hcs

# Test with 6D single store (FOV as array dimension)
python simulate_zarr_acquisition.py --structure 6d

# Test multi-region with variable FOVs per region
python simulate_zarr_acquisition.py --structure 6d --n-regions 3 --fovs-per-region 4 6 3

Documentation

  • docs/zarr_viewer_architecture.md - Architecture overview with tensorstore/dask/xarray stack

🤖 Generated with Claude Code

hongquanli and others added 14 commits January 24, 2026 21:30
Add support for reading zarr v3 datasets in OME-NGFF format created by Squid.
This enables ndviewer_light to open microscopy acquisitions saved in zarr v3
format with proper physical metadata, channel colors, and live acquisition support.

Features:
- Format detection for zarr v3 datasets (HCS plate, per-FOV, and 6D single store)
- Metadata parsing from .zattrs files (OME-NGFF multiscales, omero channels, _squid_metadata)
- FOV discovery for three zarr structures:
  - HCS plate: plate.zarr/row/col/field/acquisition.zarr
  - Per-FOV: zarr/region/fov_N.zarr
  - 6D single: zarr/region/acquisition.zarr
- Hex color to colormap conversion for channel colors
- Push-based zarr API for live acquisition viewing:
  - start_zarr_acquisition() / notify_zarr_frame() / end_zarr_acquisition()
- Live refresh support via .zattrs mtime and acquisition_complete flag

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add simulate_zarr_acquisition.py for testing the push-based zarr API.
Similar to simulate_push_acquisition.py but writes to a zarr store instead
of individual TIFF files.

Usage:
  python simulate_zarr_acquisition.py [output.zarr]
  python simulate_zarr_acquisition.py --n-t 10 --n-z 10 --channels DAPI GFP RFP

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update simulate_zarr_acquisition.py to support:
- single: Simple 5D (T, C, Z, Y, X) - single region
- 6d: 6D with FOV dimension (T, FOV, C, Z, Y, X)
- per_fov: Separate zarr per FOV: zarr/region/fov_N.zarr
- hcs: HCS plate format: plate.zarr/row/col/field/acquisition.zarr

Usage:
  python simulate_zarr_acquisition.py --structure single
  python simulate_zarr_acquisition.py --structure 6d --n-fov 4
  python simulate_zarr_acquisition.py --structure per_fov --n-fov 4
  python simulate_zarr_acquisition.py --structure hcs --wells A1 A2 B1 B2 --fov-per-well 2

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add _load_current_position() dispatch method that routes to:
- _load_current_zarr_fov() when zarr acquisition is active
- _load_current_fov() for TIFF mode

This fixes T and FOV sliders not updating the display during
zarr acquisition simulation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add fov_paths parameter to start_zarr_acquisition() for per_fov/hcs
  structures where each FOV has its own zarr store
- Update _load_zarr_plane to load from correct per-FOV store
- Hide FOV slider when there's only a single FOV
- Change simulation to write one plane at a time (0.5s default interval)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Review fixes:
- Add cleanup of _zarr_fov_stores and _zarr_fov_paths in closeEvent
- Improve error logging: escalate config errors to warning, keep
  transient errors (file not found) at debug level
- Remove unused variable use_fov_in_index
- Add validation for fov_paths/fov_labels length mismatch
- Add error handling in simulation _finish() method
- Restructure per-FOV store opening logic for clarity

Also change default simulation interval from 0.5s to 0.05s per plane.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add backward-compatible support for both old and new Squid zarr formats:

Metadata parsing (parse_zarr_v3_metadata):
- Read from zarr.json -> attributes (new) in addition to .zattrs (old)
- Handle ome.multiscales (new) and multiscales (old) locations
- Handle ome.omero (new) and omero (old) locations
- Handle _squid (new) and _squid_metadata (old) keys

Format detection (detect_format):
- Detect plate.ome.zarr in addition to plate.zarr
- Detect .ome.zarr extension for per-FOV stores

FOV discovery (discover_zarr_v3_fovs):
- Support new HCS path: plate.ome.zarr/{row}/{col}/{fov}/0
- Support new per-FOV path: zarr/{region}/fov_{n}.ome.zarr
- Handle "0" array subdirectory (new format indicator)

Live refresh (_dataset_signature):
- Check zarr.json mtime in addition to .zattrs

Add 6 new tests for the new format in TestNewSquidFormat class.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The test imports were failing because the ndviewer_light package
was not installed. Add pip install -e . to install it.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Running pytest directly can cause import issues when the package
is installed in editable mode. Using python -m pytest ensures
the package is properly importable.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pytest was finding the local ndviewer_light/ directory instead of the
installed package. Adding pythonpath = ["."] to pytest config ensures
imports resolve correctly.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The root __init__.py is a shim for Squid submodule usage. It was
missing the new zarr v3 exports (discover_zarr_v3_fovs, hex_to_colormap,
parse_zarr_v3_metadata) which caused import failures on CI.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

This PR adds OME-NGFF zarr v3 dataset support (including Squid’s formats) and a push-based zarr live acquisition API to ndviewer_light, along with tests, a simulation script, and CI/test harness updates.

Changes:

  • Extend core format detection, metadata parsing, and FOV discovery to handle zarr v3 (old and new Squid layouts), and wire this into dataset loading and live-refresh logic.
  • Add a push-based zarr acquisition API to LightweightViewer (start_zarr_acquisition, notify_zarr_frame, end_zarr_acquisition) with debounced updates and per-FOV / 6D support, plus a Qt-based simulation script.
  • Add comprehensive tests for zarr v3 format/metadata/FOV handling and zarr push API logic, and adjust packaging/CI so tests run against the installed package.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.

Show a summary per file
  • ndviewer_light/core.py: Adds zarr v3 format detection, metadata parsing (parse_zarr_v3_metadata), FOV discovery (discover_zarr_v3_fovs), hex-to-colormap mapping, zarr v3 dataset loading (_load_zarr_v3_*), and a full push-based zarr acquisition pipeline integrated with the existing viewer.
  • ndviewer_light/__init__.py: Re-exports the new zarr-related helpers (discover_zarr_v3_fovs, parse_zarr_v3_metadata, hex_to_colormap) from the package namespace.
  • __init__.py (top-level): Mirrors the new re-exports for Squid’s shim package entry point.
  • tests/test_zarr_v3.py: Introduces unit tests for zarr v3 format detection, OME-NGFF metadata parsing (multiscales, omero, Squid metadata), FOV discovery across old/new layouts, and hex_to_colormap.
  • tests/test_zarr_push_api.py: Adds logic-only tests for channel mapping, FOV label format, tracking of per-timepoint FOVs and max time, cache key structure, LUT selection from metadata, and push-mode state transitions.
  • simulate_zarr_acquisition.py: New CLI/Qt utility to simulate live acquisitions into various zarr v3 layouts (single, 6D, per-FOV, HCS) and drive the new zarr push API in LightweightViewer.
  • pyproject.toml: Adds [tool.pytest.ini_options] with pythonpath = ["."] so tests can import ndviewer_light consistently.
  • .github/workflows/ci.yml: Installs the package in editable mode before tests and switches to python -m pytest -v to run the suite in CI.


Comment on lines 7 to 13
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

import pytest


Copilot AI Jan 25, 2026

The imports tempfile, MagicMock, patch, and pytest are not used in this test module, which adds noise and can trigger linter warnings. Consider removing these unused imports and only keeping Path from pathlib to keep the test file minimal and consistent with the rest of the test suite.

Suggested change

-import tempfile
-from pathlib import Path
-from unittest.mock import MagicMock, patch
-import pytest
+from pathlib import Path

Collaborator Author

[Claude Code] Fixed in commit b1c3396 - removed unused imports.

else:
    zarr_arr = store
shape = zarr_arr.shape
dtype = zarr_arr.dtype
Copilot AI Jan 25, 2026

Variable dtype is not used.

Suggested change

-dtype = zarr_arr.dtype

Collaborator Author

[Claude Code] False positive - dtype is used at line 3293 in the error fallback case when creating a zeros placeholder array.

n_fov = get_dim("fov")
n_c = get_dim("c")
n_z = get_dim("z")
n_y = get_dim("y", shape[-2])
Copilot AI Jan 25, 2026

Variable n_y is not used.

Suggested change

-n_y = get_dim("y", shape[-2])

Collaborator Author

[Claude Code] False positive - n_y is used at lines 3292 and 3294 in the error fallback case when creating a zeros placeholder array.

n_c = get_dim("c")
n_z = get_dim("z")
n_y = get_dim("y", shape[-2])
n_x = get_dim("x", shape[-1])
Copilot AI Jan 25, 2026

Variable n_x is not used.

Suggested change

-n_x = get_dim("x", shape[-1])

Collaborator Author

[Claude Code] False positive - n_x is used at lines 3292 and 3294 in the error fallback case when creating a zeros placeholder array.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.



n_c = get_dim("c")
n_z = get_dim("z")
n_y = get_dim("y", shape[-2])
n_x = get_dim("x", shape[-1])
Copilot AI Jan 25, 2026

Variable n_x is not used.

Suggested change

-n_x = get_dim("x", shape[-1])
+_ = get_dim("x", shape[-1])

Collaborator Author

[Claude Code] Fixed in commit 7c2852f - removed both n_y and n_x as they were unused in _load_zarr_v3_6d (shape comes from dask array directly).

hongquanli and others added 2 commits January 25, 2026 00:31
- Truncate fov_paths/fov_labels to min length when mismatched (prevents IndexError)
- Add warning when metadata files exist but fail to parse
- Add comment explaining shape inference fallback logic
- Extract debounce interval to ZARR_LOAD_DEBOUNCE_MS constant
- Copy axis_names list to avoid mutating original during transposition

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
These dimension variables were extracted but never used since the
shape comes from the dask array directly in this function.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.



else:
    zarr_arr = store
shape = zarr_arr.shape
dtype = zarr_arr.dtype
Copilot AI Jan 25, 2026

Variable dtype is not used.

Suggested change

-dtype = zarr_arr.dtype

Collaborator Author

[Claude Code] Fixed in commit 4eac115 - removed unused dtype variable in _load_zarr_v3_6d.

hongquanli and others added 4 commits January 25, 2026 00:49
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add start_zarr_acquisition_6d_regions() for multi-region 6D acquisitions
- Each region can have variable FOV counts (e.g., [4, 6, 3])
- FOVs flattened across regions with labels like "region_0:0", "region_1:0"
- Add region_idx parameter to notify_zarr_frame() for region identification
- Unified 6d structure supports single and multi-region via --n-regions
- Axis order: (FOV, T, C, Z, Y, X) for all 6D structures

Fixes from code review:
- Remove dead code blocks for old 6d handling
- Add validation for empty/invalid fovs_per_region
- Clear 6D regions state when switching to other modes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…isition_6d

Shorter name since 6d now encompasses both single and multi-region modes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Zarr arrays are pre-allocated with zeros. During live acquisition, if a
user views an FOV before its data is written, zeros would be read and
cached. Later, when data exists, the stale cached zeros would be returned
instead of real data, causing black/empty images.

Fix: Only cache planes where plane.max() > 0.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
hongquanli and others added 4 commits January 25, 2026 12:04
Replaced total_fovs with sum(fovs_per_region) in the logger.info call
since the variable was removed in the previous commit.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added ruff configuration to pyproject.toml with F821 (undefined name)
rule enabled. This would have caught the total_fovs NameError.

Run with: ruff check ndviewer_light/ simulate_zarr_acquisition.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ruff-pre-commit hook to catch undefined names (F821), unused
  imports (F401), and unused variables (F841) before commit
- Fix unused variable in test_zarr_push_api.py found by ruff

To install: pre-commit install

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.



Comment on lines 3475 to 3501
def _load_zarr_v3(
    self, base_path: Path, fovs: List[Dict], structure_type: str
) -> Optional[xr.DataArray]:
    """Load zarr v3 dataset (OME-NGFF format from Squid).

    Uses tensorstore to support both zarr v2 and v3 formats.

    Args:
        base_path: Dataset root directory
        fovs: List of FOV dicts from discover_zarr_v3_fovs
        structure_type: "hcs_plate", "per_fov", or "6d_single"

    Returns:
        xarray.DataArray with dims (time, fov, z, channel, y, x)
    """
    try:
        if structure_type == "6d_single":
            return self._load_zarr_v3_6d(fovs[0]["path"])
        else:
            return self._load_zarr_v3_5d(fovs)
    except Exception as e:
        print(f"Zarr v3 load error: {e}")
        import traceback

        traceback.print_exc()
        return None

Copilot AI Jan 25, 2026

The new zarr v3 dataset loaders (_load_zarr_v3, _load_zarr_v3_5d, _load_zarr_v3_6d) are not exercised by the added tests, so regressions there (e.g., undefined-name errors or axis ordering mistakes) would not be caught by the current test suite. Given that other parts of LightweightViewer have unit tests, it would be good to add integration-style tests that create a minimal zarr v3 dataset on disk, call _create_lazy_array() / _load_zarr_v3*(), and assert on the resulting xarray.DataArray dims, shapes, and metadata.

Collaborator Author

[Claude Code] Acknowledged. Integration tests for _load_zarr_v3* functions would require Qt initialization (LightweightViewer is a QWidget), which the current test philosophy avoids. We've added ruff static analysis with F821 (undefined name) to pre-commit hooks to catch issues like missing imports at commit time. Runtime testing is done via simulate_zarr_acquisition.py which exercises the full zarr v3 loading path.

Collaborator Author

[Claude Code] Added in commit 12d492f - TestOpenZarrTensorstore test class that creates real zarr v3 arrays with tensorstore and verifies:

  • 5D and 6D array shape/dtype loading
  • Slicing behavior for lazy plane extraction
  • Error handling for invalid/missing paths

The _load_zarr_v3_5d/_load_zarr_v3_6d methods are on LightweightViewer (a QWidget), so full integration tests would need Qt. These tests verify the tensorstore layer that those methods depend on.

Comment on lines 3502 to 3615
def _load_zarr_v3_5d(self, fovs: List[Dict]) -> Optional[xr.DataArray]:
    """Load per-FOV zarr v3 stores and stack them.

    Each FOV is a separate zarr store with dimensions (T, C, Z, Y, X).
    Stacks them along a new FOV axis.

    Uses tensorstore to support both zarr v2 and v3 formats.
    """
    if not fovs:
        return None

    # Parse metadata from first FOV
    first_path = fovs[0]["path"]
    meta = parse_zarr_v3_metadata(first_path)

    # Open first zarr to get shape/dtype using tensorstore
    ts_arr = open_zarr_tensorstore(first_path, array_path="0")
    if ts_arr is None:
        logger.error("Failed to open zarr store %s", first_path)
        return None

    shape = ts_arr.shape
    dtype = ts_arr.dtype.numpy_dtype

    # Determine axis order from metadata or infer from shape
    # Typical OME-NGFF order: (T, C, Z, Y, X)
    axes = meta.get("axes", [])
    if axes:
        axis_names = [ax.get("name", "").lower() for ax in axes]
    else:
        # Infer from shape length
        if len(shape) == 5:
            axis_names = ["t", "c", "z", "y", "x"]
        elif len(shape) == 4:
            axis_names = ["c", "z", "y", "x"]
        elif len(shape) == 3:
            axis_names = ["z", "y", "x"]
        else:
            axis_names = ["y", "x"]

    # Extract dimensions
    n_t = shape[axis_names.index("t")] if "t" in axis_names else 1
    n_c = shape[axis_names.index("c")] if "c" in axis_names else 1
    n_z = shape[axis_names.index("z")] if "z" in axis_names else 1
    n_y = shape[axis_names.index("y")] if "y" in axis_names else shape[-2]
    n_x = shape[axis_names.index("x")] if "x" in axis_names else shape[-1]

    # Build channel info
    channel_names = meta.get("channel_names", [])
    if not channel_names or len(channel_names) != n_c:
        channel_names = [f"Ch{i}" for i in range(n_c)]

    channel_colors = meta.get("channel_colors", [])
    luts = {}
    for i, name in enumerate(channel_names):
        if i < len(channel_colors) and channel_colors[i]:
            luts[i] = hex_to_colormap(channel_colors[i])
        else:
            luts[i] = wavelength_to_colormap(extract_wavelength(name))

    # Create dask arrays for each FOV using tensorstore
    fov_arrays = []
    for fov_info in fovs:
        fov_path = fov_info["path"]
        try:
            ts_arr = open_zarr_tensorstore(fov_path, array_path="0")
            if ts_arr is None:
                raise RuntimeError(f"Could not open zarr store at {fov_path}")

            # Create dask array with per-plane chunks
            chunks = tuple(
                1 if i < len(shape) - 2 else s for i, s in enumerate(shape)
            )
            darr = da.from_array(ts_arr, chunks=chunks)

            # Ensure shape is (T, C, Z, Y, X)
            if len(darr.shape) == 4:
                darr = darr[np.newaxis, ...]  # Add T axis
            elif len(darr.shape) == 3:
                darr = darr[np.newaxis, np.newaxis, ...]  # Add T and C axes

            fov_arrays.append(darr)
        except Exception as e:
            logger.warning("Failed to load FOV %s: %s", fov_path, e)
            # Create zeros placeholder
            fov_arrays.append(
                da.zeros(
                    (n_t, n_c, n_z, n_y, n_x),
                    dtype=dtype,
                    chunks=(1, 1, 1, n_y, n_x),
                )
            )

    # Stack FOVs: (n_fov, T, C, Z, Y, X)
    stacked = da.stack(fov_arrays, axis=0)

    # Transpose to (T, FOV, Z, C, Y, X) then reorder to standard dims
    # Current: (FOV, T, C, Z, Y, X)
    # Want: (T, FOV, Z, C, Y, X)
    stacked = da.moveaxis(stacked, 0, 1)  # Now (T, FOV, C, Z, Y, X)
    stacked = da.moveaxis(stacked, 3, 2)  # Now (T, FOV, Z, C, Y, X)

    n_fov = len(fovs)
    xarr = xr.DataArray(
        stacked,
        dims=["time", "fov", "z", "channel", "y", "x"],
        coords={
            "time": list(range(n_t)),
            "fov": list(range(n_fov)),
            "z": list(range(n_z)),
            "channel": list(range(n_c)),
        },
    )
    xarr.attrs["luts"] = luts
Copilot AI Jan 25, 2026

This helper uses da.from_array, da.zeros, and xr.DataArray, but neither dask.array as da nor xarray as xr are imported in this function (or at module level), so any call into _load_zarr_v3_5d / _load_zarr_v3_6d will raise a NameError. To align with the existing patterns in this file (e.g. _load_ome_tiff, _load_single_tiff), add local imports for dask.array as da and xarray as xr inside these zarr v3 loader functions (or import them once at module scope).

Collaborator Author

[Claude Code] Fixed in commit 7e3c9f2 - added import dask.array as da and import xarray as xr to both _load_zarr_v3_5d and _load_zarr_v3_6d functions.

fmt = detect_format(base)

if fmt == "zarr_v3":
    # For zarr v3, check .zattrs mtime and acquisition_complete flag
Copilot AI Jan 25, 2026

The comment here still refers to checking a .zattrs mtime, but the implementation actually uses the zarr.json file’s stat() info and parse_zarr_v3_metadata() to determine acquisition_complete. For clarity and future maintenance, update the docstring/comment to match the current behavior (zarr v3 metadata lives in zarr.json, not .zattrs).

Suggested change

-# For zarr v3, check .zattrs mtime and acquisition_complete flag
+# For zarr v3, check zarr.json mtime and acquisition_complete flag from v3 metadata

Collaborator Author

[Claude Code] Fixed in commit 7e3c9f2 - updated comment from '.zattrs' to 'zarr.json' to match current implementation.

hongquanli and others added 10 commits January 25, 2026 12:42
- Add dask.array and xarray imports to _load_zarr_v3_5d and _load_zarr_v3_6d
- Update comment from ".zattrs" to "zarr.json" to reflect current code
- isort fixes applied by pre-commit hooks

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Explains how tensorstore, dask, and xarray interact to provide
lazy loading of zarr datasets. Covers the three-layer stack,
chunking strategy, supported zarr structures, and push-based API.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add TestOpenZarrTensorstore test class that:
- Creates real zarr v3 arrays with tensorstore
- Verifies open_zarr_tensorstore returns correct shape/dtype
- Tests 5D and 6D array loading
- Tests slicing behavior for lazy plane loading
- Tests error handling for invalid/missing paths

Export open_zarr_tensorstore from ndviewer_light package.

Addresses PR review comment about exercising zarr v3 loaders.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Simplifies naming - the "_single" suffix was an implementation detail.
Both file-based discovery and push-based acquisition use the same
6D zarr structure with dimensions (FOV, T, C, Z, Y, X).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add cache invalidation and pre-read state tracking to prevent caching
stale data when notify_zarr_frame() is called during a read operation.

Changes:
- Add invalidate() method to MemoryBoundedLRUCache
- Call invalidate() in notify_zarr_frame() after marking plane as written
- Check was_written_before_read instead of checking after read completes

This fixes a race where:
1. Viewer reads plane → gets zeros (data not yet flushed)
2. During read, notify_zarr_frame() called → plane added to written set
3. After read, old code checks written set → True → caches zeros (BUG)

Now the cache is invalidated on notification, and we only cache if the
plane was marked written BEFORE we started reading.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
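
The pattern described in this commit reduces to the following sketch. It is simplified: in the viewer the written set, lock, and cache live on LightweightViewer and MemoryBoundedLRUCache, and the helper names here are illustrative:

```python
import threading

written = set()
lock = threading.Lock()
cache = {}


def notify_frame(key):
    """Writer side: mark the plane written, then drop any stale cache entry."""
    with lock:
        written.add(key)
    cache.pop(key, None)  # invalidate zeros cached during an in-flight read


def load_plane(key, read_fn):
    """Reader side: snapshot the written set BEFORE reading."""
    with lock:
        was_written_before_read = key in written
    plane = read_fn(key)
    # Only cache if the plane was already written when the read started;
    # otherwise we may have read pre-flush zeros.
    if was_written_before_read:
        cache[key] = plane
    return plane
```
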
Add tests for:
- Race condition prevention with was_written_before_read
- Cache invalidation on notify_zarr_frame()
- MemoryBoundedLRUCache.invalidate() method
- Memory tracking after invalidation
- Thread safety of invalidate()

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Explains why application-level caching is necessary:
- No other layer (NDV, dask, tensorstore) provides caching
- 6-18x speedup for common interaction patterns
- Documents race condition handling for live acquisition

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Stores are already opened lazily in _load_zarr_plane(). Eager opening
was redundant and added blocking I/O at acquisition start when the
store may not exist yet (writer creates it asynchronously).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Copilot AI left a comment

Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.



Comment on lines +1472 to +1479
self._zarr_region_paths: List[Path] = []  # [region0.zarr, region1.zarr, ...]
self._zarr_region_stores: Dict[int, Any] = {}  # region_idx -> zarr store
self._fovs_per_region: List[int] = []  # [4, 6, 3] - variable per region
self._region_fov_offsets: List[int] = []  # [0, 4, 10] - cumulative offsets
self._zarr_6d_regions_mode: bool = False
self._zarr_written_planes: Set[Tuple] = (
    set()
)  # Track written planes during live acquisition
Copilot AI Jan 25, 2026

_zarr_written_planes is a plain set that is modified in notify_zarr_frame() (which is documented as thread-safe and may run on a worker thread) and read in _load_zarr_plane() (which is called from dask worker threads). Python sets are not thread-safe for concurrent mutation and reads, so this can lead to race conditions or interpreter-level errors under load. Consider protecting _zarr_written_planes with a lock (similar to MemoryBoundedLRUCache), or confining all mutations to the Qt main thread via the _zarr_frame_registered signal so that worker threads only perform reads under synchronization.

Collaborator Author

[Claude Code] Fixed in commit 7c045ba - added _zarr_written_planes_lock to protect concurrent access. Lock is used in notify_zarr_frame() for add(), _load_zarr_plane() for membership check, and all clear() calls.

Comment on lines 132 to 133
- Default size: 500 MB
- Holds ~250 planes at 1024x1024, or ~62 planes at 2048x2048
Copilot AI Jan 25, 2026

The documentation here states the plane cache default size is 500 MB, but PLANE_CACHE_MAX_MEMORY_BYTES in ndviewer_light/core.py is currently set to 256 * 1024 * 1024 (256 MB). This discrepancy can confuse users trying to reason about memory usage; consider updating either the constant or this value in the docs so they match.

Suggested change
- Default size: 500 MB
- Holds ~250 planes at 1024x1024, or ~62 planes at 2048x2048
- Default size: 256 MB
- Holds ~128 planes at 1024x1024, or ~32 planes at 2048x2048

Collaborator Author


[Claude Code] Fixed in commit 7c045ba - updated docs to show 256 MB and ~128 planes at 1024x1024.
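The corrected capacity figures follow directly from plane size; a quick sanity check, assuming 16-bit (2-byte) pixels:

```python
# Rough capacity of the 256 MB plane cache, assuming uint16 pixels (2 bytes each).
PLANE_CACHE_MAX_MEMORY_BYTES = 256 * 1024 * 1024

def planes_in_cache(height: int, width: int, bytes_per_pixel: int = 2) -> int:
    """How many full planes of the given shape fit in the cache budget."""
    return PLANE_CACHE_MAX_MEMORY_BYTES // (height * width * bytes_per_pixel)

print(planes_in_cache(1024, 1024))  # → 128
print(planes_in_cache(2048, 2048))  # → 32
```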

## Related Files

- `ndviewer_light/core.py` - Main viewer implementation (includes zarr v3 support)
- `simulate_zarr_acquisition.py` - Test script for zarr viewing

Copilot AI Jan 25, 2026


This reference to ndviewer_light/zarr_v3.py is outdated: all the zarr v3 metadata parsing and tensorstore helper functions (parse_zarr_v3_metadata, open_zarr_tensorstore, discover_zarr_v3_fovs, etc.) now live in ndviewer_light/core.py, and there is no zarr_v3.py module in the package. To avoid confusion and broken references, please update this bullet to point at ndviewer_light/core.py (or remove it if redundant).

Suggested change
- `simulate_zarr_acquisition.py` - Test script for zarr viewing

Collaborator Author


[Claude Code] Fixed in commit 3b08796 - removed reference to non-existent zarr_v3.py, updated to note all zarr v3 code is in core.py.

- Add _zarr_written_planes_lock to protect concurrent access
- Use lock in notify_zarr_frame() when adding entries
- Use lock in _load_zarr_plane() when checking membership
- Use lock in clear() calls during acquisition start/cleanup
- Fix docs: cache size is 256MB not 500MB

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add _zarr_stores_lock to protect _zarr_fov_stores, _zarr_region_stores,
  and _zarr_acquisition_store from concurrent access in dask workers
- Use lock in _load_zarr_plane() when getting/opening stores
- Use lock in start_zarr_acquisition() and start_zarr_acquisition_6d()
  when clearing stores
- Reset _zarr_acquisition_active in closeEvent() for complete cleanup

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
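The store-dict locking described in this commit can be sketched as a get-or-open cache; the names below are illustrative stand-ins for `_zarr_stores_lock` and the per-region store dicts, not the actual viewer API:

```python
import threading
from typing import Any, Callable, Dict

class StoreCache:
    """Sketch of a lock-guarded lazy store dict, as described in the
    commit message above (illustrative names, not the real core.py code)."""

    def __init__(self, opener: Callable[[int], Any]) -> None:
        self._lock = threading.Lock()      # plays the role of _zarr_stores_lock
        self._stores: Dict[int, Any] = {}  # e.g. region_idx -> zarr store
        self._opener = opener              # e.g. a tensorstore open call

    def get(self, idx: int) -> Any:
        # Get-or-open under the lock, so two dask workers can neither
        # open the same store twice nor observe a half-updated dict.
        with self._lock:
            store = self._stores.get(idx)
            if store is None:
                store = self._opener(idx)
                self._stores[idx] = store
            return store

    def clear(self) -> None:
        # Drop all stores when a new acquisition starts.
        with self._lock:
            self._stores.clear()
```

Holding the lock across the open call serializes first-time opens; that is the simple, correct choice here, since opens are rare relative to lookups.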

Copilot AI left a comment


Pull request overview

Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.



Comment on lines +2120 to +2135
# Push-based Zarr API for live acquisition
# ─────────────────────────────────────────────────────────────────────────

def start_zarr_acquisition(
self,
zarr_path: str,
channels: List[str],
num_z: int,
fov_labels: List[str],
height: int,
width: int,
fov_paths: Optional[List[str]] = None,
):
"""Configure viewer for zarr-based live acquisition.

Call this at acquisition start before any notify_zarr_frame() calls.

Copilot AI Jan 25, 2026


The new zarr push-based API entry point (start_zarr_acquisition / start_zarr_acquisition_6d) and the downstream loading path (notify_zarr_frame → _load_zarr_plane → _load_current_zarr_fov) are quite complex but currently not exercised by tests that instantiate LightweightViewer itself; the tests in tests/test_zarr_push_api.py only validate standalone logic. To reduce regression risk, consider adding at least one Qt-enabled integration test that creates a LightweightViewer, calls start_zarr_acquisition/notify_zarr_frame with a tiny tensorstore-backed zarr, and asserts that the sliders update and data loading completes without errors.

Collaborator Author


[Claude Code] Skipped for now - the logic layer is well-tested (65 tests covering format detection, metadata parsing, FOV discovery, cache behavior, thread safety, and multi-region indexing). Full Qt integration is manually testable via simulate_zarr_acquisition.py which exercises the complete path.

Adding pytest-qt and Qt integration tests is a reasonable follow-up but shouldn't block this PR. Notably, the thread safety issues found during this review (store dict locking, written planes set) wouldn't have been caught by Qt integration tests - they required code inspection.

Created issue for follow-up: consider adding Qt integration tests if regressions appear in practice.

@hongquanli hongquanli merged commit d300398 into main Jan 25, 2026
2 checks passed