python pandas dataframe datetime multi-index

InvalidIndexError in pandas 1.1.0 and above when slicing a MultiIndex frame with a DatetimeIndex


My data contains timeline values for multiple areas. I want to slice according to date.

Here is my MultiIndex Series, which I call bob:

import numpy as np
import pandas as pd

# Two zones, each with the same two dates
arrays = [[1, 1, 2, 2],
          ['2020-01-06', '2020-01-13', '2020-01-06', '2020-01-13']]
df = pd.DataFrame(np.transpose(arrays))
df[1] = pd.to_datetime(df[1])  # convert the date column to datetime64
index = pd.MultiIndex.from_frame(df, names=['zone', 'date'])
bob = pd.Series(np.random.randn(4), index=index)
print(bob)
zone  date      
1     2020-01-06   -0.513744
      2020-01-13    1.367461
2     2020-01-06    0.209916
      2020-01-13    0.397261

Now I want to take a slice of a separate DatetimeIndex and use it to select the matching rows of bob. The following code works in pandas 1.0.1 (and possibly older versions), but breaks in 1.1.0:

print(pd.__version__)
singleIndex = pd.to_datetime(pd.Index(['2020-01-06', '2020-01-13']))
dateSlice = singleIndex[1:]
print(dateSlice)
idx = pd.IndexSlice
print(bob.loc[idx[:,dateSlice]])
1.0.1
DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
zone  date      
1     2020-01-13    1.367461
2     2020-01-13    0.397261
dtype: float64

And in 1.1.0:

1.1.0
DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
Traceback (most recent call last):
  File "try.py", line 18, in <module>
    print(bob.loc[idx[:,dateSlice]])
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 873, in __getitem__
    return self._getitem_tuple(key)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 1044, in _getitem_tuple
    return self._getitem_lowerdim(tup)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 766, in _getitem_lowerdim
    return self._getitem_nested_tuple(tup)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 826, in _getitem_nested_tuple
    result = self._handle_lowerdim_multi_index_axis0(tup)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 1066, in _handle_lowerdim_multi_index_axis0
    return self._get_label(tup, axis=axis)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 1059, in _get_label
    return self.obj.xs(label, axis=axis)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/generic.py", line 3480, in xs
    loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2858, in get_loc_level
    k = self._get_level_indexer(k, level=i)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2965, in _get_level_indexer
    code = self._get_loc_single_level_index(level_index, key)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2634, in _get_loc_single_level_index
    return level_index.get_loc(key)
  File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/datetimes.py", line 586, in get_loc
    raise InvalidIndexError(key)
pandas.errors.InvalidIndexError: DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)

I admit that MultiIndex confuses me greatly. How do I do this slicing correctly in newer pandas?


Solution

  • It seems this is no longer supported. As an alternative, you can use Index.get_level_values and Index.isin with boolean indexing:

    print(bob[bob.index.get_level_values(1).isin(dateSlice)])
    zone  date      
    1     2020-01-13   -1.396496
    2     2020-01-13   -0.504466
    dtype: float64
    

    With a single string value it works a bit differently: .loc drops the date level from the result, while xs with drop_level=False keeps it:

    print(bob.loc[idx[:,'2020-01-13']])
    zone
    1   -0.200758
    2    0.410052
    dtype: float64
    
    
    print(bob.xs('2020-01-13', level=1, drop_level=False))
    zone  date      
    1     2020-01-13    1.129484
    2     2020-01-13    0.185156
    dtype: float64
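
The approaches above can be put together in one self-contained sketch. This is only an illustration, with some assumptions: the index is built with MultiIndex.from_product rather than from_frame, the values are fixed (np.arange) instead of random so the result is reproducible, and the names date_slice, mask, selected and single are made up for this example.

```python
import numpy as np
import pandas as pd

# Assumption: same shape as the question's data, but with fixed values
# so the selection is easy to verify by eye.
dates = pd.to_datetime(['2020-01-06', '2020-01-13'])
index = pd.MultiIndex.from_product([[1, 2], dates], names=['zone', 'date'])
bob = pd.Series(np.arange(4.0), index=index)

# The dates we want to keep, taken from a separate DatetimeIndex
date_slice = dates[1:]

# Boolean mask over the 'date' level; does not rely on .loc accepting
# a DatetimeIndex as a key
mask = bob.index.get_level_values('date').isin(date_slice)
selected = bob[mask]
print(selected)

# For a single date, xs with drop_level=False keeps both index levels
single = bob.xs(pd.Timestamp('2020-01-13'), level='date', drop_level=False)
print(single)
```

Referring to the level by name ('date') instead of by position (1) should keep the selection readable if the level order changes later.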