python pandas dataframe slice multi-index

pandas slice 3-level multiindex based on a list with 2 levels

Here is a minimal example:

import pandas as pd
import numpy as np
np.random.seed(0)
idx = pd.MultiIndex.from_product([[1,2,3], ['a', 'b', 'c'], [6, 7]])
df = pd.DataFrame(np.random.randn(18), index=idx)
selection = [(1, 'a'), (2, 'b')]

I would like to select all the rows in df that have as index that starts with any of the items in selection. So I would like to get the sub dataframe of df with the indices:

(1, 'a', 6), (1, 'a', 7), (2, 'b', 6), (2, 'b', 7)

What is the most straightforward/pythonian/pandasian way of doing this? What I found:

sel = [id[:2] in selection for id in df.index]
df.loc[sel]

Solution

You could use boolean indexing with isin:

out = df[df.index.isin(selection)]

Output:

              0
1 a 6  1.560268
    7  0.674709
2 b 6  0.848069
    7  0.130719

If you want to select other levels, drop the unused leading levels:

# here we want to select on levels 1 and 2
selection = [('a', 6), ('b', 7)]

df[df.index.droplevel(0).isin(selection)]

Output:

              0
1 a 6  1.560268
  b 7  0.137769
2 a 6  0.754946
  b 7  0.130719
3 a 6 -2.275646
  b 7 -2.199944