My data contains time series values for multiple areas, and I want to slice it by date.
Here is my MultiIndex Series, which I call bob:
import numpy as np
import pandas as pd
arrays = [[1,1,2,2],
['2020-01-06', '2020-01-13','2020-01-06', '2020-01-13']]
df = pd.DataFrame(np.transpose(arrays))
df[1] = pd.to_datetime(df[1])
index = pd.MultiIndex.from_frame(df, names=['zone', 'date'])
bob = pd.Series(np.random.randn(4), index=index)
print(bob)
zone date
1 2020-01-06 -0.513744
2020-01-13 1.367461
2 2020-01-06 0.209916
2020-01-13 0.397261
Now I want to take a slice of a plain DatetimeIndex and use it to select the matching rows of bob. The following code works in pandas 1.0.1 (and possibly older versions), but breaks in 1.1:
print(pd.__version__)
singleIndex = pd.to_datetime(pd.Index(['2020-01-06', '2020-01-13']))
dateSlice = singleIndex[1:]
print(dateSlice)
idx = pd.IndexSlice
print(bob.loc[idx[:,dateSlice]])
1.0.1
DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
zone date
1 2020-01-13 1.367461
2 2020-01-13 0.397261
dtype: float64
In 1.1.0, the same code raises an error:
1.1.0
DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
Traceback (most recent call last):
File "try.py", line 18, in <module>
print(bob.loc[idx[:,dateSlice]])
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 873, in __getitem__
return self._getitem_tuple(key)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 1044, in _getitem_tuple
return self._getitem_lowerdim(tup)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 766, in _getitem_lowerdim
return self._getitem_nested_tuple(tup)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 826, in _getitem_nested_tuple
result = self._handle_lowerdim_multi_index_axis0(tup)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 1066, in _handle_lowerdim_multi_index_axis0
return self._get_label(tup, axis=axis)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexing.py", line 1059, in _get_label
return self.obj.xs(label, axis=axis)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/generic.py", line 3480, in xs
loc, new_index = self.index.get_loc_level(key, drop_level=drop_level)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2858, in get_loc_level
k = self._get_level_indexer(k, level=i)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2965, in _get_level_indexer
code = self._get_loc_single_level_index(level_index, key)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/multi.py", line 2634, in _get_loc_single_level_index
return level_index.get_loc(key)
File "/home/mix/.conda/envs/covidpro/lib/python3.6/site-packages/pandas/core/indexes/datetimes.py", line 586, in get_loc
raise InvalidIndexError(key)
pandas.errors.InvalidIndexError: DatetimeIndex(['2020-01-13'], dtype='datetime64[ns]', freq=None)
I admit that MultiIndex confuses me greatly. How do I do this slicing correctly in newer pandas?
It seems this is no longer supported. As an alternative, you can use Index.get_level_values and Index.isin with boolean indexing:
print(bob[bob.index.get_level_values(1).isin(dateSlice)])
zone date
1 2020-01-13 -1.396496
2 2020-01-13 -0.504466
dtype: float64
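For reference, here is the same approach as a self-contained sketch. It is only an illustration: it uses fixed values instead of random ones so the result can be checked, builds the index with MultiIndex.from_arrays rather than the question's construction, and refers to the level by its name 'date' instead of its position. The names date_slice, mask and selected are made up for this example.

```python
import pandas as pd

# Fixed values (not random) so the selection is easy to verify.
zones = [1, 1, 2, 2]
dates = pd.to_datetime(['2020-01-06', '2020-01-13', '2020-01-06', '2020-01-13'])
index = pd.MultiIndex.from_arrays([zones, dates], names=['zone', 'date'])
bob = pd.Series([10.0, 11.0, 20.0, 21.0], index=index)

# A slice of a plain DatetimeIndex, as in the question.
single_index = pd.to_datetime(pd.Index(['2020-01-06', '2020-01-13']))
date_slice = single_index[1:]

# Boolean mask: True where the 'date' level value appears in date_slice.
mask = bob.index.get_level_values('date').isin(date_slice)
selected = bob[mask]
print(selected)
```

Because the mask is built from the level values alone, it does not depend on how loc interprets a DatetimeIndex inside a tuple key, and both index levels are kept in the result.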
With a single string value it works a bit differently: loc drops the date level from the result, whereas xs with drop_level=False keeps it:
print(bob.loc[idx[:,'2020-01-13']])
zone
1 -0.200758
2 0.410052
dtype: float64
print(bob.xs('2020-01-13', level=1, drop_level=False))
zone date
1 2020-01-13 1.129484
2 2020-01-13 0.185156
dtype: float64
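The effect of drop_level can be seen side by side in the following sketch. It assumes fixed values instead of random ones, selects by level name, and passes a pd.Timestamp rather than a string as the key; the names day, dropped and kept are made up for this example.

```python
import pandas as pd

# Fixed values (not random) so the two results are easy to compare.
zones = [1, 1, 2, 2]
dates = pd.to_datetime(['2020-01-06', '2020-01-13', '2020-01-06', '2020-01-13'])
index = pd.MultiIndex.from_arrays([zones, dates], names=['zone', 'date'])
bob = pd.Series([10.0, 11.0, 20.0, 21.0], index=index)

day = pd.Timestamp('2020-01-13')

# Default behaviour: the selected level is removed from the result's index.
dropped = bob.xs(day, level='date')

# drop_level=False keeps both levels, so the result stays MultiIndexed.
kept = bob.xs(day, level='date', drop_level=False)

print(dropped)
print(kept)
```

Keeping the level is usually the more convenient choice when the result is later combined with other slices of the same Series, since the index structure stays the same.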