Tags: python, pandas, boolean, multi-index, booleanquery

Pandas multiindex boolean indexing


Given a multi-indexed DataFrame, I would like to keep only the rows whose top-level index group satisfies a condition on every one of its rows, that is, for all values of the lower index level. Here is a small working example:

import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2, 2], 'b': [1, 2, 3, 4], 'c': [0, 2, 2, 2]})
df = df.set_index(['a', 'b'])

print(df)

out:

     c
a b   
1 1  0
  2  2
2 3  2
  4  2

Now, I would like to select on the condition c > 1. A plain boolean mask filters row by row, so something like

df[df['c'] > 1]

out:

     c
a b   
1 2  2
2 3  2
  4  2

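But I want to keep only the groups of index level a in which every row has c > 1, so the group a = 1 should be dropped entirely: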

out:

     c
a b   
2 3  2
  4  2

Any thoughts on how to do this in the most efficient way?


Solution

  • I ended up using groupby with filter, which keeps a group only when the function returns True for it:

    df.groupby(level=0).filter(lambda x: (x['c'] > 1).all())
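
Since the question asks about efficiency, here is a sketch of an alternative that avoids calling a Python function once per group. It is not part of the original answer. It builds the row-wise condition first and then uses a grouped transform to broadcast "every row in this group passes" back onto each row. The names row_ok, group_ok and result are only illustrative, and whether this is actually faster on your data is something to measure rather than assume.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 1, 2, 2], 'b': [1, 2, 3, 4], 'c': [0, 2, 2, 2]})
df = df.set_index(['a', 'b'])

# Row-wise condition: True where c > 1.
row_ok = df['c'] > 1

# For each group of index level 'a', check that all rows pass,
# and broadcast that single answer back to every row of the group.
group_ok = row_ok.groupby(level='a').transform('all')

# Keep only rows that belong to a fully passing group.
result = df[group_ok]
print(result)
```

For the example frame, this should keep only the two rows with a = 2, the same rows the filter approach keeps.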