Search code examples
pythonpandasmulti-index

Conditionally slice a pandas multiindex on specific level


For my given multi-indexed DataFrame:

df = pd.DataFrame(
    np.random.randn(12),
    index=[
        [1,1,2,3,4,4,5,5,6,6,7,8],
        [1,2,1,1,1,2,1,2,1,2,2,2],
    ]
)
            0
1 1  1.667692
  2  0.274428
2 1  0.216911
3 1 -0.513463
4 1 -0.642277
  2 -2.563876
5 1  2.301943
  2  1.455494
6 1 -1.539390
  2 -1.344079
7 2  0.300735
8 2  0.089269

I would like to slice it such that I keep only rows where second index level contains BOTH 1 and 2

            0
1 1  1.667692
  2  0.274428
4 1 -0.642277
  2 -2.563876
5 1  2.301943
  2  1.455494
6 1 -1.539390
  2 -1.344079

How can I do this?


Solution

  • Another possible solution, which is based on the following:

    • df.groupby(level=0) groups the dataframe by the first level of the index.

    • filter(lambda x: set(x.index.get_level_values(1)) == {1, 2}) checks if the second level of the index for each group contains both 1 and 2, and retains only the groups that meet this condition.

    df.groupby(level=0).filter(lambda x: set(x.index.get_level_values(1)) == {1, 2})
    

    Output:

                0
    1 1 -1.085631
      2  0.997345
    4 1 -0.578600
      2  1.651437
    5 1 -2.426679
      2 -0.428913
    6 1  1.265936
      2 -0.866740