Search code examples
pythonpandasdataframemulti-level

How to return all indexes in multiindex on ANY condition


I am trying to wrap my head around multilevel indices.

Specifically, i am trying to get all level 0 indicies that fullfill an 'ANY' criteria. But i can't for the life of me, understand how to get it to work.

For instance, in the dataframe below, we want all indicies that have a '3' in the column 'test_variable_2'

event_name test_variable_1 test_variable_2 test_variable_3
subject_id
1 pre_event NaN 3 foo
1 intra_event 15 NaN bar
1 post_event 30 NaN fum
2 pre_event NaN 2 foo
2 intra_event 45 NaN NaN
2 post_event 60 NaN fum
3 pre_event NaN 3 foo
3 intra_event_1 75 NaN bar
3 intra_event_2 90 NaN NaN
3 post_event 105 NaN fum

And the result should be:

event_name test_variable_1 test_variable_2 test_variable_3
subject_id
1 pre_event NaN 3 foo
1 intra_event 15 NaN bar
1 post_event 30 NaN fum
3 pre_event NaN 3 foo
3 intra_event_1 75 NaN bar
3 intra_event_2 90 NaN NaN
3 post_event 105 NaN fum

I thought about using the .groupby function, but I am worried that i loose some of the test-variables that contain several values. The solution i have thus far is to select the indicies that fullfill the boolean mask, and then drop all other indicies, but it seems cumbersome, and not very panda'esque.

I am certain there is a way of harnessing the multilevel indicies. Any pointers in the right direction would help.


Solution

  • Use groupby().transform:

    df[df['test_variable_2'].eq(3).groupby(level=0).transform('any')]
    

    Or you can just use unique on the valid indexes:

    df.loc[df[df.test_variable_2.eq(3)].index.unique()]
    

    Output:

                   event_name  test_variable_1  test_variable_2 test_variable_3
    subject_id                                                                 
    1               pre_event              NaN              3.0             foo
    1             intra_event             15.0              NaN             bar
    1              post_event             30.0              NaN             fum
    3               pre_event              NaN              3.0             foo
    3           intra_event_1             75.0              NaN             bar
    3           intra_event_2             90.0              NaN             NaN
    3              post_event            105.0              NaN             fum