reindex is very slow with small chunksizes #10054
Labels: bug, needs triage, topic-lazy array, topic-performance
What happened?
The time taken to set up the lazy computation seems to depend on the size of the indexers passed to Dataset.reindex.
What did you expect to happen?
Close to constant time with lazy reindexing.
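The expectation above could be checked with a rough timing sketch like the one below. It is not the reporter's original example: the helper name `time_lazy_reindex`, the variable `a`, the dimension `x`, and all sizes are assumptions chosen for illustration, and it assumes dask is installed so that `.chunk` produces lazy arrays.

```python
import time

import numpy as np
import xarray as xr


def time_lazy_reindex(n_new, n=200, chunksize=1):
    """Time how long it takes to build a lazy reindex (no compute is triggered)."""
    # Hypothetical dataset: one float variable along "x", split into small dask chunks.
    ds = xr.Dataset(
        {"a": ("x", np.arange(n, dtype=float))},
        coords={"x": np.arange(n)},
    ).chunk({"x": chunksize})

    # Hypothetical target index that only partly overlaps the original one,
    # so some labels match and the rest are filled with NaN.
    new_x = np.arange(n_new) + n // 2

    start = time.perf_counter()
    ds.reindex(x=new_x)  # should stay lazy for dask-backed data
    return time.perf_counter() - start


if __name__ == "__main__":
    # Compare setup time for growing indexer sizes.
    for n_new in (10, 100, 1_000):
        print(n_new, time_lazy_reindex(n_new))
```

If the setup time grows noticeably with `n_new` while `chunksize` stays small, that would match the behavior described in this report; the absolute numbers will vary with the installed xarray and dask versions.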
Minimal Complete Verifiable Example
MVCE confirmation
Relevant log output
Anything else we need to know?
This case shows up, for example, when using ds.interp with string variables.
Environment
INSTALLED VERSIONS
commit: None
python: 3.12.4 | packaged by conda-forge | (main, Jun 17 2024, 10:04:44) [MSC v.1940 64 bit (AMD64)]
python-bits: 64
OS: Windows
OS-release: 10
machine: AMD64
processor: Intel64 Family 6 Model 58 Stepping 9, GenuineIntel
byteorder: little
LC_ALL: None
LANG: en
LOCALE: ('Swedish_Sweden', '1252')
libhdf5: 1.14.3
libnetcdf: 4.9.2
xarray: 2024.7.1.dev363+g99426cbb.d20240904
pandas: 2.2.2
numpy: 2.2.1
scipy: 1.14.1
netCDF4: 1.7.1
pydap: 3.5
h5netcdf: 1.3.0
h5py: 3.11.0
zarr: 2.18.2
cftime: 1.6.4
nc_time_axis: 1.4.1
iris: 3.9.0
bottleneck: 1.4.0
dask: 2024.11.2
distributed: 2024.11.2
matplotlib: 3.9.2
cartopy: 0.23.0
seaborn: 0.13.2
numbagg: None
fsspec: 2024.6.1
cupy: None
pint: None
sparse: None
flox: 0.9.10
numpy_groupies: 0.11.2
setuptools: 73.0.1
pip: 24.2
conda: None
pytest: 8.3.2
mypy: 1.14.1
IPython: 8.27.0
sphinx: 8.0.2