Search code examples
pandasmulti-index

Is there a way i could work with this multiindex?


I have a dataframe like this one, https://i.sstatic.net/2Sr29.png. RBD is a code that identifies each school, LET_CUR corresponds to a class and MRUN corresponds to the amount of students in each class, what i need is the following:

I would like to know how many of the schools have at least one class with more than 45 students, so far I haven't figured out yet a code to do that.

Thanks.


Solution

  • From your DataFrame :

    >>> import pandas as pd
    >>> from io import StringIO
    
    >>> df = pd.read_csv(StringIO("""
    RBD,LET_CUR,MRUN
    1,A,65
    1,B,23
    1,C,21
    2,A,22
    2,B,20
    2,C,34
    3,A,54
    4,A,23
    4,B,11
    5,A,15
    5,C,16
    6,A,76"""))
    >>> df = df.set_index(['RBD', 'LET_CUR'])
    >>> df
                MRUN
    RBD LET_CUR     
    1   A       65
        B       23
        C       21
    2   A       22
        B       20
        C       34
    3   A       54
    4   A       23
        B       11
    5   A       15
        C       16
    6   A       76
    

    As we want to know the number of school with at leat one class having more than 45 students, we can first filter the DataFrame on the column MRUN and then use the nunique() method to count the number of unique school :

    >>> df_filtered = df[df['MRUN'] > 45].reset_index()
    >>> df_filtered['RBD'].nunique()
    3