extract_zarr_variable_encoding requires an optional argument #10070

Open
5 tasks done
aulemahal opened this issue Feb 21, 2025 · 1 comment
Labels
bug needs triage Issue that has not been reviewed by xarray team member


@aulemahal
Contributor

What happened?

The function xr.backends.zarr.extract_zarr_variable_encoding says that its region argument is optional, but the function fails when it is not given. The code seems to expect an iterable with the same length as variable.chunks, i.e. one entry per dimension.

What did you expect to happen?

Whatever region=None is supposed to mean for the code should be passed instead of None. I believe region=[slice(None, None)] * len(variable.chunks) could work.
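A minimal sketch of that workaround, for callers who need it before any fix lands. The names `full_region` and `extract_encoding_with_default_region` are hypothetical helpers and not part of xarray. The wrapper assumes that one `slice(None)` per dimension is an acceptable stand-in for `region=None`, which has not been confirmed by the maintainers.

```python
def full_region(ndim):
    """Hypothetical helper: one slice covering the whole axis, per dimension."""
    return tuple(slice(None) for _ in range(ndim))


def extract_encoding_with_default_region(variable, name=None, **kwargs):
    """Hypothetical wrapper: fill in `region` when the caller did not pass one."""
    # Imported lazily so `full_region` stays usable without xarray installed.
    from xarray.backends.zarr import extract_zarr_variable_encoding

    # Assumption: a full slice per dimension is what region=None should mean.
    if kwargs.get("region") is None:
        kwargs["region"] = full_region(variable.ndim)
    return extract_zarr_variable_encoding(variable, name=name, **kwargs)


print(full_region(3))
```

With the dataset from the example in this issue, the call would become `extract_encoding_with_default_region(ds.variables['air'], name='air')`.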

Minimal Complete Verifiable Example

import xarray as xr

ds = xr.tutorial.open_dataset('air_temperature').chunk(time=100)
ds.to_zarr('air.zarr')

ds = xr.open_zarr('air.zarr/')
xr.backends.zarr.extract_zarr_variable_encoding(ds.variables['air'], name='air')

MVCE confirmation

  • Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
  • Complete example — the example is self-contained, including all data and the text of any traceback.
  • Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
  • New issue — a search of GitHub Issues suggests this is not a duplicate.
  • Recent environment — the issue occurs with the latest version of xarray and its dependencies.

Relevant log output

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[31], line 1
----> 1 xr.backends.zarr.extract_zarr_variable_encoding(ds.variables['air'], name='air')

File ~/miniforge3/envs/rechunker/lib/python3.12/site-packages/xarray/backends/zarr.py:412, in extract_zarr_variable_encoding(variable, raise_on_invalid, name, safe_chunks, region, mode, shape)
    409         if k not in valid_encodings:
    410             del encoding[k]
--> 412 chunks = _determine_zarr_chunks(
    413     enc_chunks=encoding.get("chunks"),
    414     var_chunks=variable.chunks,
    415     ndim=variable.ndim,
    416     name=name,
    417     safe_chunks=safe_chunks,
    418     region=region,
    419     mode=mode,
    420     shape=shape,
    421 )
    422 encoding["chunks"] = chunks
    423 return encoding

File ~/miniforge3/envs/rechunker/lib/python3.12/site-packages/xarray/backends/zarr.py:268, in _determine_zarr_chunks(enc_chunks, var_chunks, ndim, name, safe_chunks, region, mode, shape)
    257 allow_partial_chunks = mode != "r+"
    259 base_error = (
    260     f"Specified zarr chunks encoding['chunks']={enc_chunks_tuple!r} for "
    261     f"variable named {name!r} would overlap multiple dask chunks {var_chunks!r} "
   (...)
    265     f"or modifying `encoding['chunks']`, or specify `safe_chunks=False`."
    266 )
--> 268 for zchunk, dchunks, interval, size in zip(
    269     enc_chunks_tuple, var_chunks, region, shape, strict=True
    270 ):
    271     if not safe_chunks:
    272         continue

TypeError: 'NoneType' object is not iterable

Anything else we need to know?

This doesn't happen with xarray < 2024.10. I believe the issue was introduced with #9559, where a region argument was added to the function with a None default.

To be frank, my actual issue is pangeo-data/rechunker#154. If the function is considered "private" and the solution is to pass region as I suggested, I will happily do a PR in rechunker instead.

Environment

INSTALLED VERSIONS

commit: None
python: 3.12.9 | packaged by conda-forge | (main, Feb 14 2025, 08:00:06) [GCC 13.3.0]
python-bits: 64
OS: Linux
OS-release: 6.12.12-200.fc41.x86_64
machine: x86_64
processor:
byteorder: little
LC_ALL: None
LANG: fr_CA.UTF-8
LOCALE: ('fr_CA', 'UTF-8')
libhdf5: 1.14.4
libnetcdf: 4.9.2

xarray: 2024.10.0
pandas: 2.2.3
numpy: 2.1.3
scipy: 1.15.2
netCDF4: 1.7.2
pydap: None
h5netcdf: 1.5.0
h5py: 3.12.1
zarr: 2.18.4
cftime: 1.6.4
nc_time_axis: 1.4.1
iris: None
bottleneck: 1.4.2
dask: 2025.2.0
distributed: 2025.2.0
matplotlib: 3.10.0
cartopy: 0.24.0
seaborn: None
numbagg: None
fsspec: 2025.2.0
cupy: None
pint: 0.24.4
sparse: 0.15.5
flox: 0.10.0
numpy_groupies: 0.11.2
setuptools: 75.8.0
pip: 25.0.1
conda: None
pytest: 8.3.4
mypy: 1.15.0
IPython: 8.32.0
sphinx: 8.2.0

@dcherian
Contributor

This function is definitely private
