feat: Add zarr v3 reading support for OME-NGFF datasets #25
Conversation
Add support for reading zarr v3 datasets in OME-NGFF format created by Squid. This enables ndviewer_light to open microscopy acquisitions saved in zarr v3 format with proper physical metadata, channel colors, and live acquisition support.

Features:
- Format detection for zarr v3 datasets (HCS plate, per-FOV, and 6D single store)
- Metadata parsing from .zattrs files (OME-NGFF multiscales, omero channels, _squid_metadata)
- FOV discovery for three zarr structures:
  - HCS plate: plate.zarr/row/col/field/acquisition.zarr
  - Per-FOV: zarr/region/fov_N.zarr
  - 6D single: zarr/region/acquisition.zarr
- Hex color to colormap conversion for channel colors
- Push-based zarr API for live acquisition viewing:
  - start_zarr_acquisition() / notify_zarr_frame() / end_zarr_acquisition()
  - Live refresh support via .zattrs mtime and acquisition_complete flag

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
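For orientation, here is a minimal sketch of how an acquisition writer might drive the push-based API named above. The viewer methods and their parameters are taken from later parts of this PR; the surrounding writer loop, the write_plane_to_zarr() helper, and the concrete values are hypothetical.

```python
# Hypothetical writer-side driver for the push-based zarr API added in this PR.
# "viewer" is an existing LightweightViewer instance; write_plane_to_zarr() is a
# stand-in for whatever actually writes pixel data into the store.
n_timepoints, n_fov, n_z, channels = 3, 2, 5, ["DAPI", "GFP"]

viewer.start_zarr_acquisition(
    zarr_path="acquisition.zarr",
    channels=channels,
    num_z=n_z,
    fov_labels=[f"fov_{i}" for i in range(n_fov)],
    height=1024,
    width=1024,
)

for t in range(n_timepoints):
    for fov in range(n_fov):
        for z in range(n_z):
            for c in range(len(channels)):
                write_plane_to_zarr(t, fov, z, c)       # writer fills the zarr store
                viewer.notify_zarr_frame(t, fov, z, c)  # viewer refresh (debounced)

viewer.end_zarr_acquisition()
```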
Add simulate_zarr_acquisition.py for testing the push-based zarr API. Similar to simulate_push_acquisition.py but writes to a zarr store instead of individual TIFF files.

Usage:
  python simulate_zarr_acquisition.py [output.zarr]
  python simulate_zarr_acquisition.py --n-t 10 --n-z 10 --channels DAPI GFP RFP

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update simulate_zarr_acquisition.py to support:
- single: Simple 5D (T, C, Z, Y, X) - single region
- 6d: 6D with FOV dimension (T, FOV, C, Z, Y, X)
- per_fov: Separate zarr per FOV: zarr/region/fov_N.zarr
- hcs: HCS plate format: plate.zarr/row/col/field/acquisition.zarr

Usage:
  python simulate_zarr_acquisition.py --structure single
  python simulate_zarr_acquisition.py --structure 6d --n-fov 4
  python simulate_zarr_acquisition.py --structure per_fov --n-fov 4
  python simulate_zarr_acquisition.py --structure hcs --wells A1 A2 B1 B2 --fov-per-well 2

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add _load_current_position() dispatch method that routes to:
- _load_current_zarr_fov() when zarr acquisition is active
- _load_current_fov() for TIFF mode

This fixes T and FOV sliders not updating the display during zarr acquisition simulation.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
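A rough sketch of the dispatch this commit describes (method names are from the commit message; the _zarr_acquisition_active flag name is assumed from later commits in this PR, and the real method may differ):

```python
def _load_current_position(self):
    # Route to the zarr loader during a live zarr acquisition, otherwise use the TIFF path.
    if self._zarr_acquisition_active:   # assumed flag name (appears in later commits)
        self._load_current_zarr_fov()
    else:
        self._load_current_fov()
```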
- Add fov_paths parameter to start_zarr_acquisition() for per_fov/hcs structures where each FOV has its own zarr store
- Update _load_zarr_plane to load from correct per-FOV store
- Hide FOV slider when there's only a single FOV
- Change simulation to write one plane at a time (0.5s default interval)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Review fixes:
- Add cleanup of _zarr_fov_stores and _zarr_fov_paths in closeEvent
- Improve error logging: escalate config errors to warning, keep transient errors (file not found) at debug level
- Remove unused variable use_fov_in_index
- Add validation for fov_paths/fov_labels length mismatch
- Add error handling in simulation _finish() method
- Restructure per-FOV store opening logic for clarity

Also change default simulation interval from 0.5s to 0.05s per plane.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add backward-compatible support for both old and new Squid zarr formats:
Metadata parsing (parse_zarr_v3_metadata):
- Read from zarr.json -> attributes (new) in addition to .zattrs (old)
- Handle ome.multiscales (new) and multiscales (old) locations
- Handle ome.omero (new) and omero (old) locations
- Handle _squid (new) and _squid_metadata (old) keys
Format detection (detect_format):
- Detect plate.ome.zarr in addition to plate.zarr
- Detect .ome.zarr extension for per-FOV stores
FOV discovery (discover_zarr_v3_fovs):
- Support new HCS path: plate.ome.zarr/{row}/{col}/{fov}/0
- Support new per-FOV path: zarr/{region}/fov_{n}.ome.zarr
- Handle "0" array subdirectory (new format indicator)
Live refresh (_dataset_signature):
- Check zarr.json mtime in addition to .zattrs
Add 6 new tests for the new format in TestNewSquidFormat class.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
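The fallback order described above is easy to picture with a small sketch. This is not the actual parse_zarr_v3_metadata implementation; the helper names and the example store path are invented for illustration, and only the key names (zarr.json vs .zattrs, ome.multiscales vs multiscales, ome.omero vs omero, _squid vs _squid_metadata) come from the commit message.

```python
import json
from pathlib import Path

def _read_attrs(store_path: Path) -> dict:
    """Return the attribute dict from zarr.json (new format) or .zattrs (old format)."""
    zarr_json = store_path / "zarr.json"
    if zarr_json.exists():
        return json.loads(zarr_json.read_text()).get("attributes", {})
    zattrs = store_path / ".zattrs"
    if zattrs.exists():
        return json.loads(zattrs.read_text())
    return {}

def _pick(attrs: dict, *dotted_keys):
    """Return the first key that resolves, e.g. ome.multiscales (new) before multiscales (old)."""
    for key in dotted_keys:
        node = attrs
        for part in key.split("."):
            node = node.get(part, {}) if isinstance(node, dict) else {}
        if node:
            return node
    return None

attrs = _read_attrs(Path("fov_0.ome.zarr"))           # example path, not from the PR
multiscales = _pick(attrs, "ome.multiscales", "multiscales")
omero = _pick(attrs, "ome.omero", "omero")
squid_meta = _pick(attrs, "_squid", "_squid_metadata")
```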
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The test imports were failing because the ndviewer_light package was not installed. Add pip install -e . to install it. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Running pytest directly can cause import issues when the package is installed in editable mode. Using python -m pytest ensures the package is properly importable. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pytest was finding the local ndviewer_light/ directory instead of the installed package. Adding pythonpath = ["."] to pytest config ensures imports resolve correctly. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The root __init__.py is a shim for Squid submodule usage. It was missing the new zarr v3 exports (discover_zarr_v3_fovs, hex_to_colormap, parse_zarr_v3_metadata) which caused import failures on CI. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
This PR adds OME-NGFF zarr v3 dataset support (including Squid’s formats) and a push-based zarr live acquisition API to ndviewer_light, along with tests, a simulation script, and CI/test harness updates.
Changes:
- Extend core format detection, metadata parsing, and FOV discovery to handle zarr v3 (old and new Squid layouts), and wire this into dataset loading and live-refresh logic.
- Add a push-based zarr acquisition API to LightweightViewer (start_zarr_acquisition, notify_zarr_frame, end_zarr_acquisition) with debounced updates and per-FOV / 6D support, plus a Qt-based simulation script.
- Add comprehensive tests for zarr v3 format/metadata/FOV handling and zarr push API logic, and adjust packaging/CI so tests run against the installed package.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `ndviewer_light/core.py` | Adds zarr v3 format detection, metadata parsing (`parse_zarr_v3_metadata`), FOV discovery (`discover_zarr_v3_fovs`), hex-to-colormap mapping, zarr v3 dataset loading (`_load_zarr_v3_*`), and a full push-based zarr acquisition pipeline integrated with the existing viewer. |
| `ndviewer_light/__init__.py` | Re-exports the new zarr-related helpers (`discover_zarr_v3_fovs`, `parse_zarr_v3_metadata`, `hex_to_colormap`) from the package namespace. |
| `__init__.py` (top-level) | Mirrors the new re-exports for Squid's shim package entry point. |
| `tests/test_zarr_v3.py` | Introduces unit tests for zarr v3 format detection, OME-NGFF metadata parsing (multiscales, omero, Squid metadata), FOV discovery across old/new layouts, and `hex_to_colormap`. |
| `tests/test_zarr_push_api.py` | Adds logic-only tests for channel mapping, FOV label format, tracking of per-timepoint FOVs and max time, cache key structure, LUT selection from metadata, and push-mode state transitions. |
| `simulate_zarr_acquisition.py` | New CLI/Qt utility to simulate live acquisitions into various zarr v3 layouts (single, 6D, per-FOV, HCS) and drive the new zarr push API in `LightweightViewer`. |
| `pyproject.toml` | Adds `[tool.pytest.ini_options]` with `pythonpath = ["."]` so tests can import `ndviewer_light` consistently. |
| `.github/workflows/ci.yml` | Installs the package in editable mode before tests and switches to `python -m pytest -v` to run the suite in CI. |
tests/test_zarr_push_api.py
```python
import tempfile
from pathlib import Path
from unittest.mock import MagicMock, patch

import pytest
```
Copilot AI (Jan 25, 2026)
The imports tempfile, MagicMock, patch, and pytest are not used in this test module, which adds noise and can trigger linter warnings. Consider removing these unused imports and only keeping Path from pathlib to keep the test file minimal and consistent with the rest of the test suite.
Suggested change:
```diff
- import tempfile
- from pathlib import Path
- from unittest.mock import MagicMock, patch
- import pytest
+ from pathlib import Path
```
[Claude Code] Fixed in commit b1c3396 - removed unused imports.
ndviewer_light/core.py
```python
else:
    zarr_arr = store
shape = zarr_arr.shape
dtype = zarr_arr.dtype
```
Copilot AI (Jan 25, 2026)
Variable dtype is not used.
Suggested change:
```diff
- dtype = zarr_arr.dtype
```
[Claude Code] False positive - dtype is used at line 3293 in the error fallback case when creating a zeros placeholder array.
ndviewer_light/core.py
| n_fov = get_dim("fov") | ||
| n_c = get_dim("c") | ||
| n_z = get_dim("z") | ||
| n_y = get_dim("y", shape[-2]) |
Copilot AI (Jan 25, 2026)
Variable n_y is not used.
Suggested change:
```diff
- n_y = get_dim("y", shape[-2])
```
[Claude Code] False positive - n_y is used at lines 3292 and 3294 in the error fallback case when creating a zeros placeholder array.
ndviewer_light/core.py
```python
n_c = get_dim("c")
n_z = get_dim("z")
n_y = get_dim("y", shape[-2])
n_x = get_dim("x", shape[-1])
```
Copilot AI (Jan 25, 2026)
Variable n_x is not used.
Suggested change:
```diff
- n_x = get_dim("x", shape[-1])
```
[Claude Code] False positive - n_x is used at lines 3292 and 3294 in the error fallback case when creating a zeros placeholder array.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
ndviewer_light/core.py
```python
n_c = get_dim("c")
n_z = get_dim("z")
n_y = get_dim("y", shape[-2])
n_x = get_dim("x", shape[-1])
```
Copilot AI (Jan 25, 2026)
Variable n_x is not used.
Suggested change:
```diff
- n_x = get_dim("x", shape[-1])
+ _ = get_dim("x", shape[-1])
```
[Claude Code] Fixed in commit 7c2852f - removed both n_y and n_x as they were unused in _load_zarr_v3_6d (shape comes from dask array directly).
- Truncate fov_paths/fov_labels to min length when mismatched (prevents IndexError)
- Add warning when metadata files exist but fail to parse
- Add comment explaining shape inference fallback logic
- Extract debounce interval to ZARR_LOAD_DEBOUNCE_MS constant
- Copy axis_names list to avoid mutating original during transposition

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
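A minimal illustration of the first review fix above, truncating mismatched lists to their common length (variable names follow the PR; the actual code may differ):

```python
# If the caller passes mismatched lists, truncate both to the shorter length
# instead of risking an IndexError when indexing fov_paths by FOV.
if fov_paths is not None and len(fov_paths) != len(fov_labels):
    n = min(len(fov_paths), len(fov_labels))
    fov_paths = fov_paths[:n]
    fov_labels = fov_labels[:n]
```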
These dimension variables were extracted but never used since the shape comes from the dask array directly in this function. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 8 out of 8 changed files in this pull request and generated 1 comment.
ndviewer_light/core.py
```python
else:
    zarr_arr = store
shape = zarr_arr.shape
dtype = zarr_arr.dtype
```
Copilot AI (Jan 25, 2026)
Variable dtype is not used.
Suggested change:
```diff
- dtype = zarr_arr.dtype
```
[Claude Code] Fixed in commit 4eac115 - removed unused dtype variable in _load_zarr_v3_6d.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add start_zarr_acquisition_6d_regions() for multi-region 6D acquisitions
- Each region can have variable FOV counts (e.g., [4, 6, 3])
- FOVs flattened across regions with labels like "region_0:0", "region_1:0"
- Add region_idx parameter to notify_zarr_frame() for region identification
- Unified 6d structure supports single and multi-region via --n-regions
- Axis order: (FOV, T, C, Z, Y, X) for all 6D structures

Fixes from code review:
- Remove dead code blocks for old 6d handling
- Add validation for empty/invalid fovs_per_region
- Clear 6D regions state when switching to other modes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
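The flattened-FOV bookkeeping described above can be sketched as follows, using the [4, 6, 3] example from the commit message. The helper and variable names below are illustrative, not the actual implementation.

```python
# Flatten FOVs across regions: offsets [0, 4, 10] and labels like "region_1:0".
fovs_per_region = [4, 6, 3]

region_fov_offsets = []
offset = 0
for count in fovs_per_region:
    region_fov_offsets.append(offset)
    offset += count
# region_fov_offsets == [0, 4, 10]; total FOVs == 13

def global_fov_index(region_idx: int, fov_in_region: int) -> int:
    """Map (region, local FOV) to the flattened FOV index used by the viewer."""
    return region_fov_offsets[region_idx] + fov_in_region

labels = [
    f"region_{r}:{i}"
    for r, count in enumerate(fovs_per_region)
    for i in range(count)
]
```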
…isition_6d Shorter name since 6d now encompasses both single and multi-region modes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Zarr arrays are pre-allocated with zeros. During live acquisition, if a user views an FOV before its data is written, zeros would be read and cached. Later, when data exists, the stale cached zeros would be returned instead of real data, causing black/empty images. Fix: Only cache planes where plane.max() > 0. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
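A minimal sketch of the caching guard this commit describes; the cache object, key, and raw_plane names are placeholders.

```python
import numpy as np

plane = np.asarray(raw_plane)   # plane just read from the pre-allocated zarr store
if plane.max() > 0:
    # Only cache planes that contain data; all-zero planes are assumed to be
    # unwritten placeholders during live acquisition and must not be cached.
    plane_cache.put(cache_key, plane)
```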
Replaced total_fovs with sum(fovs_per_region) in the logger.info call since the variable was removed in the previous commit. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added ruff configuration to pyproject.toml with F821 (undefined name) rule enabled. This would have caught the total_fovs NameError. Run with: ruff check ndviewer_light/ simulate_zarr_acquisition.py Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add ruff-pre-commit hook to catch undefined names (F821), unused imports (F401), and unused variables (F841) before commit
- Fix unused variable in test_zarr_push_api.py found by ruff

To install: pre-commit install

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Force-pushed from 3215b3a to 1876c92.
Pull request overview
Copilot reviewed 10 out of 10 changed files in this pull request and generated 3 comments.
```python
def _load_zarr_v3(
    self, base_path: Path, fovs: List[Dict], structure_type: str
) -> Optional[xr.DataArray]:
    """Load zarr v3 dataset (OME-NGFF format from Squid).

    Uses tensorstore to support both zarr v2 and v3 formats.

    Args:
        base_path: Dataset root directory
        fovs: List of FOV dicts from discover_zarr_v3_fovs
        structure_type: "hcs_plate", "per_fov", or "6d_single"

    Returns:
        xarray.DataArray with dims (time, fov, z, channel, y, x)
    """
    try:
        if structure_type == "6d_single":
            return self._load_zarr_v3_6d(fovs[0]["path"])
        else:
            return self._load_zarr_v3_5d(fovs)
    except Exception as e:
        print(f"Zarr v3 load error: {e}")
        import traceback

        traceback.print_exc()
        return None
```
Copilot AI (Jan 25, 2026)
The new zarr v3 dataset loaders (_load_zarr_v3, _load_zarr_v3_5d, _load_zarr_v3_6d) are not exercised by the added tests, so regressions there (e.g., undefined-name errors or axis ordering mistakes) would not be caught by the current test suite. Given that other parts of LightweightViewer have unit tests, it would be good to add integration-style tests that create a minimal zarr v3 dataset on disk, call _create_lazy_array() / _load_zarr_v3*(), and assert on the resulting xarray.DataArray dims, shapes, and metadata.
[Claude Code] Acknowledged. Integration tests for _load_zarr_v3* functions would require Qt initialization (LightweightViewer is a QWidget), which the current test philosophy avoids. We've added ruff static analysis with F821 (undefined name) to pre-commit hooks to catch issues like missing imports at commit time. Runtime testing is done via simulate_zarr_acquisition.py which exercises the full zarr v3 loading path.
[Claude Code] Added in commit 12d492f - TestOpenZarrTensorstore test class that creates real zarr v3 arrays with tensorstore and verifies:
- 5D and 6D array shape/dtype loading
- Slicing behavior for lazy plane extraction
- Error handling for invalid/missing paths
The _load_zarr_v3_5d/_load_zarr_v3_6d methods are on LightweightViewer (a QWidget), so full integration tests would need Qt. These tests verify the tensorstore layer that those methods depend on.
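For readers unfamiliar with tensorstore, here is a hedged sketch of what such a test can look like. This is not the PR's actual TestOpenZarrTensorstore code, and it talks to tensorstore's zarr3 driver directly rather than through the project's open_zarr_tensorstore helper; the spec layout is a standard file-backed tensorstore zarr v3 spec.

```python
import numpy as np
import tensorstore as ts

def test_open_zarr_v3_roundtrip(tmp_path):
    # Create a small 5D (T, C, Z, Y, X) zarr v3 array on disk.
    store_path = tmp_path / "fov_0.zarr"
    spec = {"driver": "zarr3", "kvstore": {"driver": "file", "path": str(store_path)}}
    arr = ts.open(spec, create=True, dtype=ts.uint16, shape=(2, 3, 4, 32, 32)).result()
    arr[0, 0, 0].write(np.full((32, 32), 7, dtype=np.uint16)).result()

    # Re-open it and check shape/dtype plus a lazy single-plane read.
    reopened = ts.open(spec).result()
    assert reopened.shape == (2, 3, 4, 32, 32)
    assert reopened.dtype.numpy_dtype == np.uint16
    plane = reopened[0, 0, 0].read().result()
    assert plane.max() == 7
```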
```python
def _load_zarr_v3_5d(self, fovs: List[Dict]) -> Optional[xr.DataArray]:
    """Load per-FOV zarr v3 stores and stack them.

    Each FOV is a separate zarr store with dimensions (T, C, Z, Y, X).
    Stacks them along a new FOV axis.

    Uses tensorstore to support both zarr v2 and v3 formats.
    """
    if not fovs:
        return None

    # Parse metadata from first FOV
    first_path = fovs[0]["path"]
    meta = parse_zarr_v3_metadata(first_path)

    # Open first zarr to get shape/dtype using tensorstore
    ts_arr = open_zarr_tensorstore(first_path, array_path="0")
    if ts_arr is None:
        logger.error("Failed to open zarr store %s", first_path)
        return None

    shape = ts_arr.shape
    dtype = ts_arr.dtype.numpy_dtype

    # Determine axis order from metadata or infer from shape
    # Typical OME-NGFF order: (T, C, Z, Y, X)
    axes = meta.get("axes", [])
    if axes:
        axis_names = [ax.get("name", "").lower() for ax in axes]
    else:
        # Infer from shape length
        if len(shape) == 5:
            axis_names = ["t", "c", "z", "y", "x"]
        elif len(shape) == 4:
            axis_names = ["c", "z", "y", "x"]
        elif len(shape) == 3:
            axis_names = ["z", "y", "x"]
        else:
            axis_names = ["y", "x"]

    # Extract dimensions
    n_t = shape[axis_names.index("t")] if "t" in axis_names else 1
    n_c = shape[axis_names.index("c")] if "c" in axis_names else 1
    n_z = shape[axis_names.index("z")] if "z" in axis_names else 1
    n_y = shape[axis_names.index("y")] if "y" in axis_names else shape[-2]
    n_x = shape[axis_names.index("x")] if "x" in axis_names else shape[-1]

    # Build channel info
    channel_names = meta.get("channel_names", [])
    if not channel_names or len(channel_names) != n_c:
        channel_names = [f"Ch{i}" for i in range(n_c)]

    channel_colors = meta.get("channel_colors", [])
    luts = {}
    for i, name in enumerate(channel_names):
        if i < len(channel_colors) and channel_colors[i]:
            luts[i] = hex_to_colormap(channel_colors[i])
        else:
            luts[i] = wavelength_to_colormap(extract_wavelength(name))

    # Create dask arrays for each FOV using tensorstore
    fov_arrays = []
    for fov_info in fovs:
        fov_path = fov_info["path"]
        try:
            ts_arr = open_zarr_tensorstore(fov_path, array_path="0")
            if ts_arr is None:
                raise RuntimeError(f"Could not open zarr store at {fov_path}")

            # Create dask array with per-plane chunks
            chunks = tuple(
                1 if i < len(shape) - 2 else s for i, s in enumerate(shape)
            )
            darr = da.from_array(ts_arr, chunks=chunks)

            # Ensure shape is (T, C, Z, Y, X)
            if len(darr.shape) == 4:
                darr = darr[np.newaxis, ...]  # Add T axis
            elif len(darr.shape) == 3:
                darr = darr[np.newaxis, np.newaxis, ...]  # Add T and C axes

            fov_arrays.append(darr)
        except Exception as e:
            logger.warning("Failed to load FOV %s: %s", fov_path, e)
            # Create zeros placeholder
            fov_arrays.append(
                da.zeros(
                    (n_t, n_c, n_z, n_y, n_x),
                    dtype=dtype,
                    chunks=(1, 1, 1, n_y, n_x),
                )
            )

    # Stack FOVs: (n_fov, T, C, Z, Y, X)
    stacked = da.stack(fov_arrays, axis=0)

    # Transpose to (T, FOV, Z, C, Y, X) then reorder to standard dims
    # Current: (FOV, T, C, Z, Y, X)
    # Want:    (T, FOV, Z, C, Y, X)
    stacked = da.moveaxis(stacked, 0, 1)  # Now (T, FOV, C, Z, Y, X)
    stacked = da.moveaxis(stacked, 3, 2)  # Now (T, FOV, Z, C, Y, X)

    n_fov = len(fovs)
    xarr = xr.DataArray(
        stacked,
        dims=["time", "fov", "z", "channel", "y", "x"],
        coords={
            "time": list(range(n_t)),
            "fov": list(range(n_fov)),
            "z": list(range(n_z)),
            "channel": list(range(n_c)),
        },
    )
    xarr.attrs["luts"] = luts
```
Copilot AI (Jan 25, 2026)
This helper uses da.from_array, da.zeros, and xr.DataArray, but neither dask.array as da nor xarray as xr are imported in this function (or at module level), so any call into _load_zarr_v3_5d / _load_zarr_v3_6d will raise a NameError. To align with the existing patterns in this file (e.g. _load_ome_tiff, _load_single_tiff), add local imports for dask.array as da and xarray as xr inside these zarr v3 loader functions (or import them once at module scope).
[Claude Code] Fixed in commit 7e3c9f2 - added import dask.array as da and import xarray as xr to both _load_zarr_v3_5d and _load_zarr_v3_6d functions.
ndviewer_light/core.py
```python
fmt = detect_format(base)

if fmt == "zarr_v3":
    # For zarr v3, check .zattrs mtime and acquisition_complete flag
```
Copilot AI (Jan 25, 2026)
The comment here still refers to checking a .zattrs mtime, but the implementation actually uses the zarr.json file’s stat() info and parse_zarr_v3_metadata() to determine acquisition_complete. For clarity and future maintenance, update the docstring/comment to match the current behavior (zarr v3 metadata lives in zarr.json, not .zattrs).
Suggested change:
```diff
- # For zarr v3, check .zattrs mtime and acquisition_complete flag
+ # For zarr v3, check zarr.json mtime and acquisition_complete flag from v3 metadata
```
[Claude Code] Fixed in commit 7e3c9f2 - updated comment from '.zattrs' to 'zarr.json' to match current implementation.
- Add dask.array and xarray imports to _load_zarr_v3_5d and _load_zarr_v3_6d
- Update comment from ".zattrs" to "zarr.json" to reflect current code
- isort fixes applied by pre-commit hooks

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Explains how tensorstore, dask, and xarray interact to provide lazy loading of zarr datasets. Covers the three-layer stack, chunking strategy, supported zarr structures, and push-based API. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add TestOpenZarrTensorstore test class that:
- Creates real zarr v3 arrays with tensorstore
- Verifies open_zarr_tensorstore returns correct shape/dtype
- Tests 5D and 6D array loading
- Tests slicing behavior for lazy plane loading
- Tests error handling for invalid/missing paths

Export open_zarr_tensorstore from ndviewer_light package.

Addresses PR review comment about exercising zarr v3 loaders.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Simplifies naming - the "_single" suffix was an implementation detail. Both file-based discovery and push-based acquisition use the same 6D zarr structure with dimensions (FOV, T, C, Z, Y, X). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add cache invalidation and pre-read state tracking to prevent caching stale data when notify_zarr_frame() is called during a read operation.

Changes:
- Add invalidate() method to MemoryBoundedLRUCache
- Call invalidate() in notify_zarr_frame() after marking plane as written
- Check was_written_before_read instead of checking after read completes

This fixes a race where:
1. Viewer reads plane → gets zeros (data not yet flushed)
2. During read, notify_zarr_frame() called → plane added to written set
3. After read, old code checks written set → True → caches zeros (BUG)

Now the cache is invalidated on notification, and we only cache if the plane was marked written BEFORE we started reading.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
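Condensed, the fixed read path looks roughly like this. Names such as _zarr_written_planes_lock and was_written_before_read come from the commit messages in this PR; the cache attribute and the read_plane() helper are placeholders.

```python
# Snapshot the "written" flag BEFORE reading, so a notify_zarr_frame() that arrives
# mid-read cannot trick us into caching a stale all-zeros plane.
with self._zarr_written_planes_lock:
    was_written_before_read = plane_key in self._zarr_written_planes

plane = read_plane(plane_key)   # placeholder for the tensorstore read

if was_written_before_read:
    # Safe to cache: the data already existed when the read started.
    self._plane_cache.put(plane_key, plane)
# notify_zarr_frame() itself invalidates the cache after marking a plane as
# written, so any stale entry cached earlier is dropped.
```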
Add tests for:
- Race condition prevention with was_written_before_read
- Cache invalidation on notify_zarr_frame()
- MemoryBoundedLRUCache.invalidate() method
- Memory tracking after invalidation
- Thread safety of invalidate()

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Explains why application-level caching is necessary:
- No other layer (NDV, dask, tensorstore) provides caching
- 6-18x speedup for common interaction patterns
- Documents race condition handling for live acquisition

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Stores are already opened lazily in _load_zarr_plane(). Eager opening was redundant and added blocking I/O at acquisition start when the store may not exist yet (writer creates it asynchronously). Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.
```python
self._zarr_region_paths: List[Path] = []  # [region0.zarr, region1.zarr, ...]
self._zarr_region_stores: Dict[int, Any] = {}  # region_idx -> zarr store
self._fovs_per_region: List[int] = []  # [4, 6, 3] - variable per region
self._region_fov_offsets: List[int] = []  # [0, 4, 10] - cumulative offsets
self._zarr_6d_regions_mode: bool = False
self._zarr_written_planes: Set[Tuple] = (
    set()
)  # Track written planes during live acquisition
```
Copilot AI (Jan 25, 2026)
_zarr_written_planes is a plain set that is modified in notify_zarr_frame() (which is documented as thread-safe and may run on a worker thread) and read in _load_zarr_plane() (which is called from dask worker threads). Python sets are not thread-safe for concurrent mutation and reads, so this can lead to race conditions or interpreter-level errors under load. Consider protecting _zarr_written_planes with a lock (similar to MemoryBoundedLRUCache), or confining all mutations to the Qt main thread via the _zarr_frame_registered signal so that worker threads only perform reads under synchronization.
[Claude Code] Fixed in commit 7c045ba - added _zarr_written_planes_lock to protect concurrent access. Lock is used in notify_zarr_frame() for add(), _load_zarr_plane() for membership check, and all clear() calls.
docs/zarr_viewer_architecture.md
```markdown
- Default size: 500 MB
- Holds ~250 planes at 1024x1024, or ~62 planes at 2048x2048
```
Copilot AI (Jan 25, 2026)
The documentation here states the plane cache default size is 500 MB, but PLANE_CACHE_MAX_MEMORY_BYTES in ndviewer_light/core.py is currently set to 256 * 1024 * 1024 (256 MB). This discrepancy can confuse users trying to reason about memory usage; consider updating either the constant or this value in the docs so they match.
Suggested change:
```diff
- - Default size: 500 MB
- - Holds ~250 planes at 1024x1024, or ~62 planes at 2048x2048
+ - Default size: 256 MB
+ - Holds ~128 planes at 1024x1024, or ~32 planes at 2048x2048
```
[Claude Code] Fixed in commit 7c045ba - updated docs to show 256 MB and ~128 planes at 1024x1024.
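A quick arithmetic check of the corrected numbers, assuming uint16 pixels (2 bytes each):

```python
plane_1024 = 1024 * 1024 * 2       # bytes per 1024x1024 uint16 plane (2 MiB)
plane_2048 = 2048 * 2048 * 2       # bytes per 2048x2048 uint16 plane (8 MiB)
cache_bytes = 256 * 1024 * 1024    # PLANE_CACHE_MAX_MEMORY_BYTES (256 MB)

print(cache_bytes // plane_1024)   # -> 128 planes at 1024x1024
print(cache_bytes // plane_2048)   # -> 32 planes at 2048x2048
```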
```markdown
## Related Files

- `ndviewer_light/core.py` - Main viewer implementation (includes zarr v3 support)
- `simulate_zarr_acquisition.py` - Test script for zarr viewing
```
Copilot AI (Jan 25, 2026)
This reference to ndviewer_light/zarr_v3.py is outdated: all the zarr v3 metadata parsing and tensorstore helper functions (parse_zarr_v3_metadata, open_zarr_tensorstore, discover_zarr_v3_fovs, etc.) now live in ndviewer_light/core.py, and there is no zarr_v3.py module in the package. To avoid confusion and broken references, please update this bullet to point at ndviewer_light/core.py (or remove it if redundant).
Suggested change:
```diff
- - `simulate_zarr_acquisition.py` - Test script for zarr viewing
```
[Claude Code] Fixed in commit 3b08796 - removed reference to non-existent zarr_v3.py, updated to note all zarr v3 code is in core.py.
- Add _zarr_written_planes_lock to protect concurrent access
- Use lock in notify_zarr_frame() when adding entries
- Use lock in _load_zarr_plane() when checking membership
- Use lock in clear() calls during acquisition start/cleanup
- Fix docs: cache size is 256MB not 500MB

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add _zarr_stores_lock to protect _zarr_fov_stores, _zarr_region_stores, and _zarr_acquisition_store from concurrent access in dask workers
- Use lock in _load_zarr_plane() when getting/opening stores
- Use lock in start_zarr_acquisition() and start_zarr_acquisition_6d() when clearing stores
- Reset _zarr_acquisition_active in closeEvent() for complete cleanup

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Pull request overview
Copilot reviewed 16 out of 16 changed files in this pull request and generated 1 comment.
```python
# Push-based Zarr API for live acquisition
# ─────────────────────────────────────────────────────────────────────────

def start_zarr_acquisition(
    self,
    zarr_path: str,
    channels: List[str],
    num_z: int,
    fov_labels: List[str],
    height: int,
    width: int,
    fov_paths: Optional[List[str]] = None,
):
    """Configure viewer for zarr-based live acquisition.

    Call this at acquisition start before any notify_zarr_frame() calls.
```
Copilot AI (Jan 25, 2026)
The new zarr push-based API entry point (start_zarr_acquisition / start_zarr_acquisition_6d) and the downstream loading path (notify_zarr_frame → _load_zarr_plane → _load_current_zarr_fov) are quite complex but currently not exercised by tests that instantiate LightweightViewer itself; the tests in tests/test_zarr_push_api.py only validate standalone logic. To reduce regression risk, consider adding at least one Qt-enabled integration test that creates a LightweightViewer, calls start_zarr_acquisition/notify_zarr_frame with a tiny tensorstore-backed zarr, and asserts that the sliders update and data loading completes without errors.
[Claude Code] Skipped for now - the logic layer is well-tested (65 tests covering format detection, metadata parsing, FOV discovery, cache behavior, thread safety, and multi-region indexing). Full Qt integration is manually testable via simulate_zarr_acquisition.py which exercises the complete path.
Adding pytest-qt and Qt integration tests is a reasonable follow-up but shouldn't block this PR. Notably, the thread safety issues found during this review (store dict locking, written planes set) wouldn't have been caught by Qt integration tests - they required code inspection.
Created issue for follow-up: consider adding Qt integration tests if regressions appear in practice.
Summary

Features

Tensorstore Backend

File-based Loading

Format detection: Automatically detects zarr v3 datasets by checking for:
- `plate.zarr/` or `plate.ome.zarr/` directory (HCS plate format)
- `zarr/` directory with `.zattrs` or `.ome.zarr` subdirectories
- `.zarr` directories with `zarr.json`

Metadata parsing: Extracts from `zarr.json`:
- `ome.multiscales[0].axes` and `coordinateTransformations` (physical pixel sizes)
- `ome.omero.channels` (channel names and hex colors)
- `_squid` (Squid-specific fields: pixel_size_um, z_step_um, acquisition_complete)

FOV discovery: Supports multiple zarr structures:
- `plate.ome.zarr/{row}/{col}/{fov}/0`
- `zarr/region/fov_N.zarr` or `fov_N.ome.zarr`
- `zarr/region/acquisition.zarr` with shape (FOV, T, C, Z, Y, X)

Push-based Zarr API (Live Acquisition)
- `start_zarr_acquisition()` - for per_fov/hcs structures
- `start_zarr_acquisition_6d()` - for multi-region 6D structures with variable FOVs per region
- `notify_zarr_frame(t, fov_idx, z, channel, region_idx)` - thread-safe notification
- `end_zarr_acquisition()`
- `was_written_before_read` pattern

Thread Safety
- `_zarr_written_planes_lock` protects the written planes set
- `_zarr_stores_lock` protects lazy-loaded store dicts

New Exports
- `open_zarr_tensorstore()`
- `parse_zarr_v3_metadata()`
- `discover_zarr_v3_fovs()`
- `hex_to_colormap()`

Test plan
- `pytest tests/test_zarr_v3.py tests/test_zarr_push_api.py -v` (65 tests)
- `pytest tests/ -v` (211 tests)
- `black --check .`

Testing the Push API
Use the simulation script to test the push-based zarr API:
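The fenced block that belongs here did not survive extraction; the invocations below are the ones given for the simulation script in the commit messages earlier in this PR:

```
python simulate_zarr_acquisition.py --structure single
python simulate_zarr_acquisition.py --structure 6d --n-fov 4
python simulate_zarr_acquisition.py --structure per_fov --n-fov 4
python simulate_zarr_acquisition.py --structure hcs --wells A1 A2 B1 B2 --fov-per-well 2
```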
Documentation
- `docs/zarr_viewer_architecture.md` - Architecture overview with tensorstore/dask/xarray stack

🤖 Generated with Claude Code