I have a DataFrame with a MultiIndex.
I can build a boolean mask on its first index level with the following:
df.index.get_level_values(0).to_series().str.find('1000')!=-1
This returns True for all the rows where the first index level contains the characters '1000' and False otherwise.
But I am not able to slice the dataframe using that mask.
I tried with:
df[df.index.get_level_values(0).to_series().str.find('1000')!=-1]
and it returned the following error:
ValueError: cannot reindex from a duplicate axis
I also tried with:
df[df.index.get_level_values(0).to_series().str.find('1000')!=-1,:]
which only prints the boolean mask, followed by this error:
Length: 1755, dtype: bool, slice(None, None, None))' is an invalid key
Can someone point me to the right solution, and to a good reference on how to properly slice a MultiIndex DataFrame?
One option is to drop to_series() and use str.contains (the Index counterpart of Series.str.contains) to test for the substring:
df[df.index.get_level_values(0).str.contains('1000')]
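To make this concrete, here is a minimal sketch on a toy frame. The index labels, the level names and the `val` column are made up for illustration and are not taken from the question's data:

```python
import pandas as pd

# Hypothetical toy data: string labels on the first level, with duplicates
idx = pd.MultiIndex.from_tuples(
    [("a1000", 1), ("a1000", 2), ("b2000", 1), ("c1000x", 1)],
    names=["code", "n"],
)
df = pd.DataFrame({"val": [10, 20, 30, 40]}, index=idx)

# Positional boolean mask: one entry per row, no index attached to align on
mask = df.index.get_level_values(0).str.contains("1000")

result = df[mask]
print(result)
```

Note that str.contains treats the pattern as a regular expression by default; if the substring could contain regex metacharacters, passing `regex=False` matches it literally.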
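Another is to convert the mask to a NumPy array explicitly (with numpy imported as np). On an Index, str.contains typically already returns a plain boolean array in recent pandas versions, so this mainly matters when the mask is a Series:
df[np.asarray(df.index.get_level_values(0).str.contains('1000'))]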
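Your original approach also works once the mask is converted to an array. The Series returned by to_series() uses the (duplicated) level values as its own index, so pandas tries to align it with df and fails; passing .values skips that alignment: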
df[(df.index.get_level_values(0).to_series().str.find('1000')!=-1).values]
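As a rough check that the approaches agree, the sketch below compares the find-based mask with the str.contains one. The toy index, level names and `val` column are assumptions made for illustration only:

```python
import pandas as pd

# Hypothetical toy data with duplicated labels on the first level
idx = pd.MultiIndex.from_tuples(
    [("a1000", 1), ("a1000", 2), ("b2000", 1), ("c1000x", 1)],
    names=["code", "n"],
)
df = pd.DataFrame({"val": [10, 20, 30, 40]}, index=idx)

level0 = df.index.get_level_values(0)

# to_series() yields a Series indexed by the labels themselves,
# so its index contains duplicates and does not match df's MultiIndex
series_mask = level0.to_series().str.find("1000") != -1
print(series_mask.index.has_duplicates)

# Stripping the index with .values leaves a purely positional mask
via_find = df[series_mask.values]
via_contains = df[level0.str.contains("1000")]

print(via_find.equals(via_contains))
```

Both selections should return the same rows; the str.contains version is simply shorter and never builds the intermediate Series.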