Select slice of dataframe according to value of multiIndex


I have a multiIndex dataframe.

I am able to create a logical mask using the following:

df.index.get_level_values(0).to_series().str.find('1000')!=-1

This returns True for every row whose first index level contains the characters '1000', and False otherwise.

But I am not able to slice the dataframe using that mask.

I tried with:

df[df.index.get_level_values(0).to_series().str.find('1000')!=-1]

and it returned the following error:

ValueError: cannot reindex from a duplicate axis

I also tried with:

df[df.index.get_level_values(0).to_series().str.find('1000')!=-1,:]

which only printed the boolean mask as part of the following error:

Length: 1755, dtype: bool, slice(None, None, None))' is an invalid key

Can someone point me to the right solution and to a good reference on how to properly slice a MultiIndex dataframe?


Solution

  • One idea is to remove to_series() and use Index.str.contains to test for the substring:

    df[df.index.get_level_values(0).str.contains('1000')]
    

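    Another is to convert the mask to a NumPy array explicitly. np.asarray (with numpy imported as np) is used here because Index.str.contains can already return a plain array, which has no .values attribute:

    df[np.asarray(df.index.get_level_values(0).str.contains('1000'))]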
    

    Your solution also works once the values of the mask are converted to an array:

    df[(df.index.get_level_values(0).to_series().str.find('1000')!=-1).values]
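
The approaches above can be tried on a small frame. This is only a minimal sketch: the sample data, the level names ("code", "part") and the column name "value" are made up for illustration, and regex=False is an optional extra so that '1000' is matched as a literal substring.

```python
import pandas as pd

# Hypothetical sample data: level 0 holds string codes (with duplicates),
# level 1 holds a sub-key.
index = pd.MultiIndex.from_tuples(
    [("A1000", 1), ("A1000", 2), ("B2000", 1), ("C1000X", 1)],
    names=["code", "part"],
)
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=index)

# Mask built directly on the Index: it carries no labels of its own,
# so it is applied to the rows purely by position.
mask = df.index.get_level_values(0).str.contains("1000", regex=False)
selected = df[mask]

# The mask from the question, with its labels dropped via to_numpy()
# so that pandas does not try to align it with the DataFrame's index.
mask_from_series = (
    df.index.get_level_values(0).to_series().str.find("1000") != -1
).to_numpy()
selected_alt = df[mask_from_series]

print(selected)
```

The attempt in the question most likely fails because to_series() labels the mask with the level-0 values themselves, which contain duplicates and do not match the DataFrame's MultiIndex, so pandas cannot align the mask with the rows. Passing a plain array avoids that alignment step.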