What happened?
When trying to concatenate data using a pandas MultiIndex and then unstack it to get two independent dimensions (e.g. for varying different parameters in a simulation), the unstack call errors. I have seen different errors with different data (the MVCE errors with ValueError: IndexVariable objects must be 1-dimensional, but my data errors with ValueError: cannot re-index or align objects with conflicting indexes found for the following dimensions: 'concat_dim' (2 conflicting indexes)).
One hint at the bug might be that conc._indexes shows more indexes than display(conc).
What did you expect to happen?
Originally (I think it was v2022.3.0), it used to unstack neatly into the two levels of the MultiIndex as separate dimensions.
Minimal Complete Verifiable Example
import xarray as xr
import numpy as np
import pandas as pd

ds = xr.Dataset(
    data_vars={"a": (("dim1", "dim2"), np.arange(16).reshape(4, 4))},
    coords={"dim1": list(range(4)), "dim2": list(range(2, 6))},
)
dslist = [ds for i in range(6)]
arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo"],
    ["one", "two", "one", "two", "one", "two"],
]
mindex = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=["first", "second"])

conc = xr.concat(dslist, dim=mindex)
conc.unstack("concat_dim")  # this errors

conc = xr.concat(dslist, dim="concat_dim")
conc = conc.assign_coords(dict(concat_dim=mindex)).unstack("concat_dim")  # this does not
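On newer xarray releases the same workaround can be written without assigning a pandas MultiIndex directly — a minimal sketch, repeating the setup for completeness and assuming xr.Coordinates.from_pandas_multiindex is available (added in v2023.08.0):

```python
import numpy as np
import pandas as pd
import xarray as xr

ds = xr.Dataset(
    data_vars={"a": (("dim1", "dim2"), np.arange(16).reshape(4, 4))},
    coords={"dim1": list(range(4)), "dim2": list(range(2, 6))},
)
dslist = [ds for _ in range(6)]
arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo"],
    ["one", "two", "one", "two", "one", "two"],
]
mindex = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=["first", "second"])

# Concatenate along a plain dimension name, then attach the multi-index
# as an explicit Coordinates object before unstacking.
conc = xr.concat(dslist, dim="concat_dim")
mindex_coords = xr.Coordinates.from_pandas_multiindex(mindex, "concat_dim")
unstacked = conc.assign_coords(mindex_coords).unstack("concat_dim")
```

This sidesteps the deprecation warning that direct MultiIndex assignment triggers on recent versions.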
MVCE confirmation
Minimal example — the example is as focused as reasonably possible to demonstrate the underlying issue in xarray.
Complete example — the example is self-contained, including all data and the text of any traceback.
Verifiable example — the example copy & pastes into an IPython prompt or Binder notebook, returning the result.
New issue — a search of GitHub Issues suggests this is not a duplicate.
Looks like passing a pandas.MultiIndex object as the dim argument to concat was forgotten during the explicit indexes refactor. While this can be fixed (it could be tricky), we should deprecate it: it is convenient but probably too neat now that multi-index levels have their own, "real" coordinates (see #6293 (comment)). Explicitly chaining concat with assign_coords (and set_index), like the last line in your example, should be preferred.
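The explicit pattern recommended above — concat along a plain dimension name, assign the level coordinates, then build the multi-index with set_index — can be sketched as follows (a sketch following the MVCE's setup; the `first`/`second` variable names are illustrative, not the eventual API):

```python
import numpy as np
import xarray as xr

ds = xr.Dataset(
    data_vars={"a": (("dim1", "dim2"), np.arange(16).reshape(4, 4))},
    coords={"dim1": list(range(4)), "dim2": list(range(2, 6))},
)
dslist = [ds for _ in range(6)]

# Attach the would-be multi-index levels as plain coordinates first,
# then promote them to a multi-index with set_index before unstacking.
first = ["bar", "bar", "baz", "baz", "foo", "foo"]
second = ["one", "two", "one", "two", "one", "two"]
conc = xr.concat(dslist, dim="concat_dim")
conc = conc.assign_coords(
    first=("concat_dim", first), second=("concat_dim", second)
)
unstacked = conc.set_index(concat_dim=["first", "second"]).unstack("concat_dim")
```

With this route each level exists as an explicit coordinate before the multi-index is created, so no implicit index is smuggled in through the dim argument.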
Anything else we need to know?
No response
Relevant log output
ValueError Traceback (most recent call last)
Cell In [24], line 15
12 mindex = pd.MultiIndex.from_tuples(list(zip(*arrays)), names=["first", "second"])
14 conc = xr.concat(dslist, dim=mindex)
---> 15 conc.unstack("concat_dim")
File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/dataset.py:4870, in Dataset.unstack(self, dim, fill_value, sparse)
4866 result = result._unstack_full_reindex(
4867 d, stacked_indexes[d], fill_value, sparse
4868 )
4869 else:
-> 4870 result = result._unstack_once(d, stacked_indexes[d], fill_value, sparse)
4871 return result
File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/dataset.py:4706, in Dataset._unstack_once(self, dim, index_and_vars, fill_value, sparse)
4703 else:
4704 fill_value = fill_value
-> 4706 variables[name] = var._unstack_once(
4707 index=clean_index,
4708 dim=dim,
4709 fill_value=fill_value,
4710 sparse=sparse,
4711 )
4712 else:
4713 variables[name] = var
File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/variable.py:1764, in Variable._unstack_once(self, index, dim, fill_value, sparse)
1759 # Indexer is a list of lists of locations. Each list is the locations
1760 # on the new dimension. This is robust to the data being sparse; in that
1761 # case the destinations will be NaN / zero.
1762 data[(..., *indexer)] = reordered
-> 1764 return self._replace(dims=new_dims, data=data)
File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/variable.py:1017, in Variable._replace(self, dims, data, attrs, encoding)
1015 if encoding is _default:
1016 encoding = copy.copy(self._encoding)
-> 1017 return type(self)(dims, data, attrs, encoding, fastpath=True)
File ~/.conda/envs/xwrf-dev/lib/python3.10/site-packages/xarray/core/variable.py:2776, in IndexVariable.__init__(self, dims, data, attrs, encoding, fastpath)
2774 super().__init__(dims, data, attrs, encoding, fastpath)
2775 if self.ndim != 1:
-> 2776 raise ValueError(f"{type(self).__name__} objects must be 1-dimensional")
2778 # Unlike in Variable, always eagerly load values into memory
2779 if not isinstance(self._data, PandasIndexingAdapter):
ValueError: IndexVariable objects must be 1-dimensional
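The final frames show which invariant breaks: _unstack_once hands multi-dimensional data to _replace, and IndexVariable construction rejects anything that is not 1-D. The constraint itself is easy to confirm with the public xr.IndexVariable class (a minimal sketch, independent of the bug):

```python
import numpy as np
import xarray as xr

# 1-D data is fine for an IndexVariable...
ok = xr.IndexVariable(("x",), np.arange(3))
print(ok.ndim)  # 1

# ...but anything higher-dimensional raises the same ValueError
# seen in the last frame of the traceback above.
try:
    xr.IndexVariable(("x", "y"), np.arange(6).reshape(2, 3))
except ValueError as err:
    print(err)  # IndexVariable objects must be 1-dimensional
```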
Environment
xr.show_versions()
INSTALLED VERSIONS
commit: None
python: 3.10.6 | packaged by conda-forge | (main, Aug 22 2022, 20:35:26) [GCC 10.4.0]
python-bits: 64
OS: Linux
OS-release: 4.18.0-305.25.1.el8_4.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: en_US.UTF-8
LANG: en_US.UTF-8
LOCALE: ('en_US', 'UTF-8')
libhdf5: 1.12.2
libnetcdf: 4.8.1
xarray: 2022.9.0
pandas: 1.5.0
numpy: 1.23.3
scipy: 1.9.1
netCDF4: 1.6.1
pydap: None
h5netcdf: 1.0.2
h5py: 3.7.0
Nio: None
zarr: None
cftime: 1.6.2
nc_time_axis: None
PseudoNetCDF: None
rasterio: None
cfgrib: None
iris: None
bottleneck: None
dask: 2022.9.2
distributed: 2022.9.2
matplotlib: 3.6.0
cartopy: 0.21.0
seaborn: None
numbagg: None
fsspec: 2022.8.2
cupy: None
pint: 0.19.2
sparse: None
flox: None
numpy_groupies: None
setuptools: 65.4.1
pip: 22.2.2
conda: None
pytest: None
IPython: 8.5.0
sphinx: None