OME-NGFF to xarray Mapping#
This page documents how each component of the OME-NGFF metadata specification is mapped to xarray data structures.
Design Philosophy#
Core Principle: Metadata that can be represented in xarray’s native data model (coordinates, dimension names) is stored there, not duplicated in attrs.
xarray coordinates: Represent actual data (physical positions, channel labels)
xarray dimensions: Represent axis names
xarray attrs: Store metadata that has no native xarray representation
This design ensures:
Natural xarray workflows (.sel(), .isel(), coordinate-based indexing)
No redundancy between coords and attrs
Full round-trip fidelity via preserved metadata dict
Metadata Mapping Reference#
axes#
OME-NGFF Spec: §2.1 Axes
OME-NGFF axes metadata describes the dimensions of the array.
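| OME-NGFF Field | xarray Location | Notes |
|---|---|---|
| axes[].name | Dataset.dims | Axis names become dimension names (e.g., 'c', 'z', 'y', 'x') |
| axes[].type | attrs['ome_axes_types'] | List of types (e.g., ['channel', 'space', 'space', 'space']) |
| axes[].unit | attrs['ome_axes_units'] | Dict mapping axis name to unit (e.g., {'z': 'micrometer'}) |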
Example:
import xarray as xr
ds = xr.open_dataset("image.ome.zarr", engine="ome-zarr")
# Axis names → dimensions
print(ds.dims) # {'c': 2, 'z': 236, 'y': 275, 'x': 271}
# Axis types → attrs
print(ds.attrs['ome_axes_types']) # ['channel', 'space', 'space', 'space']
# Axis units → attrs
print(ds.attrs['ome_axes_units']) # {'z': 'micrometer', 'y': 'micrometer', 'x': 'micrometer'}
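The split shown above can be sketched in plain Python. This is only an illustration: the helper name `split_axes` is hypothetical (it is not part of xarray-ome), and the input is assumed to follow the OME-NGFF v0.4 `axes` layout, one dict per axis with a `name` and optional `type` and `unit`.

```python
from __future__ import annotations

from typing import Any


def split_axes(
    axes: list[dict[str, Any]],
) -> tuple[list[str], list[str | None], dict[str, str]]:
    """Split an OME-NGFF ``axes`` list into names, types, and units.

    Hypothetical helper for illustration only.
    """
    # Names become the dimension names, in array order.
    names = [axis["name"] for axis in axes]
    # Types stay a list aligned with the dimension order.
    types = [axis.get("type") for axis in axes]
    # Units are optional, so only axes that declare one appear in the dict.
    units = {axis["name"]: axis["unit"] for axis in axes if "unit" in axis}
    return names, types, units


axes = [
    {"name": "c", "type": "channel"},
    {"name": "z", "type": "space", "unit": "micrometer"},
    {"name": "y", "type": "space", "unit": "micrometer"},
    {"name": "x", "type": "space", "unit": "micrometer"},
]
names, types, units = split_axes(axes)
print(names)
print(types)
print(units)
```

The channel axis has no unit, so it is absent from the units dict, which matches the `ome_axes_units` example above.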
coordinateTransformations#
OME-NGFF Spec: §2.3 Coordinate Transformations
Coordinate transformations define the mapping from array indices to physical coordinates.
| OME-NGFF Field | xarray Location | Notes |
|---|---|---|
| scale | Dataset.coords | Converted to coordinate arrays via transforms_to_coords() |
| translation | Dataset.coords | Offset applied to coordinate arrays |
| Original values | attrs['ome_scale'], attrs['ome_translation'] | Preserved for efficient round-tripping |
Example:
# OME-NGFF metadata:
# scale = {'z': 0.5, 'y': 0.36, 'x': 0.36}
# translation = {'z': 0.0, 'y': 0.0, 'x': 0.0}
ds = xr.open_dataset("image.ome.zarr", engine="ome-zarr")
# Coordinates derived from transforms
print(ds.coords['z'].values[:3]) # [0.0, 0.5, 1.0] (0 + 0.5 * [0,1,2,...])
print(ds.coords['y'].values[:3]) # [0.0, 0.36, 0.72]
# Original transforms preserved
print(ds.attrs['ome_scale']) # {'z': 0.5, 'y': 0.36, 'x': 0.36}
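The arithmetic behind these coordinate values is `translation + scale * index` along each axis. A minimal plain-Python sketch follows; the function name `scale_translation_to_coords` is hypothetical and is not the library's `transforms_to_coords()`, whose signature is not shown on this page.

```python
from __future__ import annotations


def scale_translation_to_coords(
    size: int, scale: float, translation: float = 0.0
) -> list[float]:
    """Physical coordinate of every index along one axis.

    Hypothetical helper: index i maps to translation + scale * i.
    """
    return [translation + scale * i for i in range(size)]


# z axis from the example above: scale 0.5, translation 0.0
z = scale_translation_to_coords(4, scale=0.5)
# Same idea with a non-zero offset (values chosen for illustration)
y = scale_translation_to_coords(4, scale=0.36, translation=10.0)
print(z)
print(y)
```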
Round-trip:
When writing, coords_to_transforms() extracts scale and translation from coordinates, or uses stored values for exact fidelity.
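One way to picture this fallback is sketched below. It is an assumption about the general approach, not the library's code: the name `infer_scale_translation` and the tolerance values are hypothetical, and the sketch assumes non-empty, regularly spaced coordinates.

```python
from __future__ import annotations

import math
from typing import Sequence


def infer_scale_translation(
    coords: Sequence[float],
    stored: tuple[float, float] | None = None,
    rel_tol: float = 1e-9,
) -> tuple[float, float]:
    """Recover (scale, translation) from regularly spaced coordinates.

    Hypothetical helper. If the stored pair still describes the coordinates
    (within tolerance), it is returned unchanged so the original numbers
    survive the round trip without floating-point drift.
    """
    translation = float(coords[0])
    if len(coords) > 1:
        # Average spacing between the first and last sample.
        scale = (float(coords[-1]) - translation) / (len(coords) - 1)
    else:
        # A single sample carries no spacing information.
        scale = stored[0] if stored is not None else 1.0

    if stored is not None:
        stored_scale, stored_translation = stored
        same_scale = math.isclose(stored_scale, scale, rel_tol=rel_tol)
        same_offset = math.isclose(
            stored_translation, translation, rel_tol=rel_tol, abs_tol=1e-12
        )
        if same_scale and same_offset:
            return stored_scale, stored_translation
    return scale, translation


# Coordinates untouched since reading: the stored values win.
unchanged = infer_scale_translation([0.0, 0.36, 0.72, 1.08], stored=(0.36, 0.0))
# Coordinates were shifted by the user: values are re-derived from the coords.
shifted = infer_scale_translation([5.0, 5.5, 6.0], stored=(0.5, 0.0))
print(unchanged)
print(shifted)
```

Preferring the stored pair when it still matches avoids writing back a value such as 0.36000000000000004 where the source file said 0.36.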
multiscales#
OME-NGFF Spec: §2.4 Multiscales
Multiscales metadata describes the image pyramid structure.
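| OME-NGFF Field | xarray Location | Notes |
|---|---|---|
| name | attrs['ome_name'] | Image identifier |
| version | attrs['ome_version'] | OME-NGFF spec version |
| type | Not currently mapped | Downscaling method |
| metadata | Not currently mapped | Additional downscaling info |
| datasets[].path | attrs['ome_multiscale_paths'] | List of resolution paths (e.g., ['0', '1', '2']) |
| Number of datasets | attrs['ome_num_resolutions'] | Count of resolution levels |
| datasets[].coordinateTransformations | Dataset.coords (per dataset) | Applied per resolution level |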
Example:
dt = xr.open_datatree("image.ome.zarr", engine="ome-zarr")
# Multiscale info in root attrs
print(dt.attrs['ome_name']) # 'image'
print(dt.attrs['ome_version']) # '0.4'
print(dt.attrs['ome_num_resolutions']) # 3
print(dt.attrs['ome_multiscale_paths']) # ['0', '1', '2']
# Each resolution as a separate DataTree node
print(list(dt.children.keys())) # ['scale0', 'scale1', 'scale2']
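How the root attributes relate to a `multiscales` entry can be sketched without any file I/O. The helper name `summarize_multiscales` is hypothetical, and the input dict is a hand-written example in the OME-NGFF v0.4 layout; only the attribute keys come from this page.

```python
from __future__ import annotations

from typing import Any


def summarize_multiscales(multiscale: dict[str, Any]) -> dict[str, Any]:
    """Derive the root-level attrs from one ``multiscales`` entry.

    Hypothetical helper for illustration only.
    """
    paths = [dataset["path"] for dataset in multiscale["datasets"]]
    return {
        "ome_name": multiscale.get("name"),
        "ome_version": multiscale.get("version"),
        "ome_multiscale_paths": paths,
        # The level count is simply the number of datasets.
        "ome_num_resolutions": len(paths),
    }


multiscale = {
    "name": "image",
    "version": "0.4",
    "datasets": [
        {"path": "0", "coordinateTransformations": [{"type": "scale", "scale": [1.0, 0.5, 0.36, 0.36]}]},
        {"path": "1", "coordinateTransformations": [{"type": "scale", "scale": [1.0, 1.0, 0.72, 0.72]}]},
        {"path": "2", "coordinateTransformations": [{"type": "scale", "scale": [1.0, 2.0, 1.44, 1.44]}]},
    ],
}
root_attrs = summarize_multiscales(multiscale)
print(root_attrs)
```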
omero#
OME-NGFF Spec: §2.5 OMERO Metadata (Transitional)
OMERO metadata provides channel information and rendering settings.
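| OME-NGFF Field | xarray Location | Notes |
|---|---|---|
| channels[].label | Dataset.coords['c'] | Channel labels as coordinate values (string dtype) |
| channels[].color | attrs['ome_channel_colors'] | List of hex color codes (e.g., ['0000FF', 'FFFF00']) |
| channels[].window | attrs['ome_channel_windows'] | List of rendering window dicts |
| Other OMERO fields | attrs['ome_ngff_metadata'] | Preserved in full metadata dict |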
Example:
ds = xr.open_dataset("image.ome.zarr", engine="ome-zarr")
# Channel labels → coordinates (PRIMARY LOCATION)
print(ds.coords['c'].values) # ['LaminB1' 'Dapi']
print(ds.coords['c'].dtype) # <U7 (Unicode string)
# Select by channel name
lamin_data = ds.sel(c='LaminB1')
# Channel colors → attrs
print(ds.attrs['ome_channel_colors']) # ['0000FF', 'FFFF00']
# Rendering windows → attrs
print(ds.attrs['ome_channel_windows'][0])
# {'min': 0.0, 'max': 65535.0, 'start': 0.0, 'end': 1500.0}
Why channel labels are coordinates:
Channel labels identify positions along the channel dimension, which is exactly what xarray coordinates are for:
# Coordinate-based selection (natural xarray API)
ds.sel(c='DAPI')
# Coordinate-based filtering
ds.where(ds.c.isin(['DAPI', 'GFP']), drop=True)
# Coordinate-based iteration
for channel in ds.coords['c'].values:
    process(ds.sel(c=channel))
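The division of OMERO channel metadata between coordinates and attrs can be sketched as follows. The helper name `split_omero_channels` is hypothetical (the library's own `_extract_channel_labels()` is not reproduced here), and falling back to the channel index when a label is missing is an assumption made for this sketch.

```python
from __future__ import annotations

from typing import Any


def split_omero_channels(
    omero: dict[str, Any],
) -> tuple[list[str], dict[str, list[Any]]]:
    """Separate channel labels (coordinate values) from rendering info (attrs).

    Hypothetical helper for illustration only.
    """
    channels = omero.get("channels", [])
    # Assumption: use the channel index as the label when none is given.
    labels = [str(ch.get("label", i)) for i, ch in enumerate(channels)]
    attrs = {
        "ome_channel_colors": [ch.get("color") for ch in channels],
        "ome_channel_windows": [ch.get("window") for ch in channels],
    }
    return labels, attrs


omero = {
    "channels": [
        {
            "label": "LaminB1",
            "color": "0000FF",
            "window": {"min": 0.0, "max": 65535.0, "start": 0.0, "end": 1500.0},
        },
        {
            "label": "Dapi",
            "color": "FFFF00",
            "window": {"min": 0.0, "max": 65535.0, "start": 0.0, "end": 1500.0},
        },
    ]
}
labels, channel_attrs = split_omero_channels(omero)
print(labels)
print(channel_attrs["ome_channel_colors"])
```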
labels#
OME-NGFF Spec: §2.6 Labels
Label images are not yet supported. When implemented, they will likely be stored as separate DataArrays or referenced paths.
plate / well#
OME-NGFF Spec: §2.7 Plate | §2.8 Well
HCS (High Content Screening) plate structures are not yet supported. See TODO.md in the project root for implementation notes.
Complete Attribute Reference#
Common Attributes (DataTree & Dataset)#
These attributes are present in both DataTree root nodes and individual Datasets:
attrs = {
# Basic metadata
'ome_name': 'image', # Image name
'ome_version': '0.4', # OME-NGFF version
# Axes information
'ome_axes_types': ['channel', 'space', ...], # Axis types
'ome_axes_units': {'z': 'micrometer', ...}, # Physical units (optional)
'ome_axes_orientations': {...}, # Anatomical orientations (optional)
# Multiscale info
'ome_num_resolutions': 3, # Number of pyramid levels
'ome_multiscale_paths': ['0', '1', '2'], # Resolution paths
# Channel metadata (if channels present)
'ome_channel_colors': ['0000FF', 'FFFF00'], # Hex colors
'ome_channel_windows': [{...}, {...}], # Rendering windows
# Complete metadata for round-tripping
'ome_ngff_metadata': {...}, # Full OME-NGFF metadata dict
}
Dataset-Only Attributes#
Datasets also contain coordinate transformation info:
attrs = {
# Coordinate transforms (for efficient round-trip)
'ome_scale': {'c': 1.0, 'z': 0.5, ...}, # Scale factors
'ome_translation': {'c': 0.0, 'z': 0.0, ...}, # Translation offsets
# Resolution level (only in open_ome_dataset())
'ome_ngff_resolution': 0, # Resolution index
}
Round-Trip Fidelity#
Metadata is rebuilt from coords and attrs on write, so a read-modify-write cycle preserves it:
import xarray as xr
from xarray_ome import write_ome_dataset
# Read
ds = xr.open_dataset("input.ome.zarr", engine="ome-zarr")
# Modify data; request keep_attrs explicitly, since binary arithmetic
# drops attrs by default in many xarray releases
with xr.set_options(keep_attrs=True):
    ds_modified = ds * 2
# Write - metadata reconstructed from coords + attrs
write_ome_dataset(ds_modified, "output.ome.zarr")
# Verify
ds2 = xr.open_dataset("output.ome.zarr", engine="ome-zarr")
assert ds2.attrs['ome_ngff_metadata'] == ds.attrs['ome_ngff_metadata']
The full OME-NGFF metadata dict is always preserved in attrs['ome_ngff_metadata'], ensuring that even unknown or future metadata fields survive round-tripping.
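Why keeping the full dict protects unknown fields can be illustrated with a small sketch. The merge strategy here is an assumption, since this page does not describe how the writer combines stored and reconstructed metadata; `rebuild_metadata` and the `futureField` key are hypothetical.

```python
from __future__ import annotations

import copy
from typing import Any


def rebuild_metadata(
    stored: dict[str, Any], updates: dict[str, Any]
) -> dict[str, Any]:
    """Overlay reconstructed fields on a copy of the stored metadata.

    Hypothetical helper: fields the writer does not know about are carried
    over untouched because the stored dict is the starting point.
    """
    merged = copy.deepcopy(stored)  # never mutate the caller's attrs
    merged.update(updates)          # top-level overwrite of known fields
    return merged


stored = {
    "name": "image",
    "version": "0.4",
    "futureField": {"note": "unknown to this writer"},  # hypothetical key
}
merged = rebuild_metadata(stored, {"name": "renamed"})
print(merged)
```

Starting from the stored dict and overwriting only the known fields means a key the writer has never heard of passes through unchanged.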
Implementation Details#
Conversion Functions#
Reading: OME-NGFF → xarray#
metadata_to_xarray_attrs(): Extracts non-coordinate metadata to attrs
transforms_to_coords(): Converts scale/translation to coordinate arrays
_extract_channel_labels(): Gets channel labels from omero.channels
Writing: xarray → OME-NGFF#
xarray_to_metadata(): Reconstructs OME-NGFF metadata from attrs
coords_to_transforms(): Extracts scale/translation from coordinates
See xarray_ome/metadata.py and xarray_ome/transforms.py for implementation.
Version Support#
Reading: All OME-NGFF versions (v0.1 - v0.5) via ngff-zarr
Writing: v0.4 and v0.5 (current standard versions)
Version information is preserved in attrs['ome_version'].
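Because reading covers more versions than writing, a caller may want to check the preserved version before writing. The sketch below is a hypothetical guard, not library behavior: the function name, the default of "0.4", and the choice to raise an error are all assumptions.

```python
from __future__ import annotations

from typing import Any

# Versions listed above as supported for writing.
SUPPORTED_WRITE_VERSIONS = ("0.4", "0.5")


def resolve_write_version(attrs: dict[str, Any], default: str = "0.4") -> str:
    """Pick the version to write, based on attrs['ome_version'].

    Hypothetical helper: raises if the preserved version is read-only.
    """
    version = str(attrs.get("ome_version", default))
    if version not in SUPPORTED_WRITE_VERSIONS:
        raise ValueError(
            f"OME-NGFF v{version} can be read but not written; "
            f"choose one of {SUPPORTED_WRITE_VERSIONS}"
        )
    return version


print(resolve_write_version({"ome_version": "0.5"}))
print(resolve_write_version({}))
```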