Basic Usage

This example demonstrates the fundamental operations for reading OME-Zarr files with xarray-ome.

Opening OME-Zarr Files

There are two main ways to open OME-Zarr files:

  1. As a DataTree - contains all resolution levels

  2. As a Dataset - contains a single resolution level

import xarray as xr
from xarray_ome import open_ome_dataset, open_ome_datatree

# Sample data from the Image Data Resource
url = "https://uk1s3.embassy.ebi.ac.uk/idr/zarr/v0.4/idr0062A/6001240.zarr"

Opening as Dataset

Open a single resolution level (default is highest resolution):

# Open highest resolution (default)
ds = open_ome_dataset(url)

print("Dataset dimensions:", dict(ds.sizes))
print("Data variables:", list(ds.data_vars.keys()))
print("Coordinate names:", list(ds.coords.keys()))
Dataset dimensions: {'c': 2, 'z': 236, 'y': 275, 'x': 271}
Data variables: ['image']
Coordinate names: ['c', 'z', 'y', 'x']

Exploring Coordinates

OME-NGFF coordinate transformations are converted to xarray coordinates, so the spatial axes are expressed in physical units rather than pixel indices:

# Show coordinate details
print("\nCoordinate ranges:")
for name, coord in ds.coords.items():
    print(f"{name}: {coord.shape[0]} points")
    if name in ['x', 'y', 'z']:
        unit = ds.attrs['ome_axes_units'].get(name, 'unknown')
        print(f"  Range: [{float(coord.min()):.2f}, {float(coord.max()):.2f}] {unit}")
Coordinate ranges:
c: 2 points
z: 236 points
  Range: [0.00, 117.55] micrometer
y: 275 points
  Range: [0.00, 98.75] micrometer
x: 271 points
  Range: [0.00, 97.31] micrometer
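
As a rough illustration of how a scale transformation can become a coordinate axis, the sketch below multiplies each pixel index by the axis scale factor. It assumes a zero translation offset, and `physical_coords` is a hypothetical helper written for this example; it is not xarray-ome's actual implementation. The scale factors and axis lengths are the values this page reports for the sample dataset.

```python
# Scale factors and axis lengths copied from the output shown on this page.
scale = {"z": 0.5002025531914894, "y": 0.3603981534640209, "x": 0.3603981534640209}
sizes = {"z": 236, "y": 275, "x": 271}


def physical_coords(n_points, step, offset=0.0):
    """Hypothetical helper: physical position of each index, offset + i * step."""
    return [offset + i * step for i in range(n_points)]


# Assumption: no translation, so every axis starts at 0.
coords = {axis: physical_coords(sizes[axis], scale[axis]) for axis in scale}

for axis, values in coords.items():
    print(f"{axis}: [{values[0]:.2f}, {values[-1]:.2f}]")
```

Under these assumptions the printed ranges agree with the coordinate ranges listed above, which suggests the coordinates are simply index times scale for this file.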

Opening as DataTree

Open all resolution levels in the multiscale pyramid:

# Open as DataTree to get all resolutions
dt = open_ome_datatree(url)

print("\nDataTree structure:")
print(f"Root node: {dt.name}")
print(f"Children: {list(dt.children.keys())}")

# Show shape of each resolution level
print("\nResolution levels:")
for name, child in dt.children.items():
    shape = child.ds['image'].shape
    print(f"  {name}: {shape}")
DataTree structure:
Root node: root
Children: ['scale0', 'scale1', 'scale2']

Resolution levels:
  scale0: (2, 236, 275, 271)
  scale1: (2, 236, 137, 135)
  scale2: (2, 236, 68, 67)
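
The shapes above are consistent with each level halving the y and x axes (rounding down) while leaving c and z unchanged. The sketch below reproduces that pattern. The downsampling factor of 2, the choice of axes, and the helper name `pyramid_shapes` are assumptions made for illustration; other OME-Zarr files may use different factors or also downsample z.

```python
def pyramid_shapes(base_shape, n_levels, factor=2, spatial_axes=(2, 3)):
    """Hypothetical helper: expected shape of each pyramid level.

    Axes listed in spatial_axes are floor-divided by factor at every level;
    all other axes keep their size.
    """
    shapes = [tuple(base_shape)]
    for _ in range(n_levels - 1):
        previous = shapes[-1]
        shapes.append(
            tuple(
                size // factor if axis in spatial_axes else size
                for axis, size in enumerate(previous)
            )
        )
    return shapes


# Base shape (c, z, y, x) taken from scale0 above.
levels = pyramid_shapes((2, 236, 275, 271), n_levels=3)
for index, shape in enumerate(levels):
    print(f"scale{index}: {shape}")
```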

Accessing Different Resolutions

# Access highest resolution from DataTree
high_res = dt["scale0"].ds
print(f"Highest resolution: {high_res['image'].shape}")

# Access lower resolution for quick previews
low_res = dt["scale2"].ds
print(f"Lowest resolution: {low_res['image'].shape}")
Highest resolution: (2, 236, 275, 271)
Lowest resolution: (2, 236, 68, 67)

Using xarray’s Native Backend

You can also use xarray’s native functions with the engine="ome-zarr" parameter:

# Using xarray's native functions
ds_native = xr.open_dataset(url, engine="ome-zarr")
dt_native = xr.open_datatree(url, engine="ome-zarr")

print("\nUsing native xarray backend:")
print(f"Dataset: {ds_native['image'].shape}")
print(f"DataTree: {list(dt_native.children.keys())}")
Using native xarray backend:
Dataset: (2, 236, 275, 271)
DataTree: ['scale0', 'scale1', 'scale2']

Lazy Loading

Data is loaded lazily with Dask: opening a file reads only metadata, and pixel data is fetched when a computation requests it:

import dask.array as da

print(f"Data type: {type(ds['image'].data)}")
print(f"Is Dask array: {isinstance(ds['image'].data, da.Array)}")

# Inspect the chunk layout; pixel data is only loaded when .compute() is called
print(f"\nChunk size: {ds['image'].data.chunksize}")
Data type: <class 'dask.array.core.Array'>
Is Dask array: True

Chunk size: (1, 1, 275, 271)
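
The chunk size determines how much data a given operation has to fetch. The sketch below uses only the shape and chunk size reported above to count chunks. `chunks_per_axis` is a hypothetical helper for this example, and the byte size of a chunk is not computed because the pixel dtype is not shown on this page.

```python
import math

shape = (2, 236, 275, 271)   # (c, z, y, x), from the dataset above
chunks = (1, 1, 275, 271)    # chunk size reported by Dask above


def chunks_per_axis(shape, chunks):
    """Hypothetical helper: number of chunks along each axis."""
    return tuple(math.ceil(n / c) for n, c in zip(shape, chunks))


grid = chunks_per_axis(shape, chunks)

# Total number of chunks in the full-resolution array.
total_chunks = math.prod(grid)

# Chunks touched when reducing over z for a single channel:
# one channel, every z chunk, every y/x chunk.
single_channel_chunks = math.prod(grid[1:])

# Pixels held by one chunk (one full y/x plane here).
pixels_per_chunk = math.prod(chunks)

print(f"Chunk grid: {grid}")
print(f"Total chunks: {total_chunks}")
print(f"Chunks for one channel: {single_channel_chunks}")
print(f"Pixels per chunk: {pixels_per_chunk}")
```

With this layout each chunk is one full y/x plane, so selecting a single z-slice of one channel needs exactly one chunk.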

Selecting Subsets

Use xarray’s label-based (.sel) and position-based (.isel) indexing to work with subsets:

# Select single channel
channel_0 = ds.sel(c='LaminB1')
print(f"Single channel shape: {channel_0['image'].shape}")

# Select z-slice
z_slice = ds.isel(z=100)
print(f"Z-slice shape: {z_slice['image'].shape}")

# Combine selections
subset = ds.sel(c='Dapi').isel(z=slice(0, 10))
print(f"Subset shape: {subset['image'].shape}")
Single channel shape: (236, 275, 271)
Z-slice shape: (2, 275, 271)
Subset shape: (10, 275, 271)
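
Because the spatial coordinates are in physical units, `.sel` can also select by position rather than by index. The sketch below shows this on a small synthetic dataset, so it runs without network access. The array sizes, spacing values, and variable names are hypothetical stand-ins; only the channel names are taken from the example above.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for an opened OME-Zarr dataset (hypothetical sizes and spacing).
data = np.arange(2 * 4 * 3 * 5, dtype="uint16").reshape(2, 4, 3, 5)
demo = xr.Dataset(
    {"image": (("c", "z", "y", "x"), data)},
    coords={
        "c": ["LaminB1", "Dapi"],
        "z": np.arange(4) * 0.5,    # physical positions: 0.0, 0.5, 1.0, 1.5
        "y": np.arange(3) * 0.36,
        "x": np.arange(5) * 0.36,
    },
)

# Label-based: pick a channel by name, then the z-plane nearest to 0.9.
# The two .sel calls are separate because method="nearest" applies to every
# indexer in a call and is not meaningful for string labels.
plane = demo["image"].sel(c="Dapi").sel(z=0.9, method="nearest")

# Position-based: the same plane by integer index.
same_plane = demo["image"].isel(c=1, z=2)

print(f"Plane shape: {plane.shape}")
print(f"Selected z position: {float(plane['z'])}")
print(f"Same data as isel: {bool((plane == same_plane).all())}")
```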

Computing Results

Load data into memory when needed:

# Create a maximum intensity projection (lazy operation)
mip = ds['image'].sel(c='LaminB1').max(dim='z')
print(f"MIP (lazy): {type(mip.data)}")

# Compute the result
mip_computed = mip.compute()
print(f"MIP (computed): {type(mip_computed.data)}")
print(f"MIP shape: {mip_computed.shape}")
MIP (lazy): <class 'dask.array.core.Array'>
MIP (computed): <class 'numpy.ndarray'>
MIP shape: (275, 271)

Metadata Access

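All OME-NGFF metadata is preserved in the dataset's attributes. Fields the file does not define, such as the image name in this example, come back as None when read with .get():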

print("\nMetadata attributes:")
print(f"  Image name: {ds.attrs.get('ome_image_name')}")
print(f"  OME-NGFF version: {ds.attrs['ome_ngff_metadata']['version']}")
print(f"  Axes units: {ds.attrs['ome_axes_units']}")
print(f"  Scale factors: {ds.attrs['ome_scale']}")
Metadata attributes:
  Image name: None
  OME-NGFF version: 0.4
  Axes units: {'z': 'micrometer', 'y': 'micrometer', 'x': 'micrometer'}
  Scale factors: {'c': 1.0, 'z': 0.5002025531914894, 'y': 0.3603981534640209, 'x': 0.3603981534640209}