Search code examples
pythonpython-3.xpandasdataframedata-science

Pandas multiindex dataframe - Selecting max from one index within multiindex


I've got a dataframe with a multi index of Year and Month like the following

     |     |Value
Year |Month|  
     |  1  |  3
1992 |  2  |  5
     |  3  |  8
     | ... | ...
1993 |  1  |  2
     | ... | ...

I'm trying to select the maximum Value for each year and put that in a DF like this:

     | Max
Year |  
1992 |  5
1993 |  2
     | ... 

There's not much info on multi-indexes, should I simply do a group by and apply or something similar to make it more simple?


Solution

  • Exactly right:

    df.groupby(level=0).apply(max)
    

    In my sample DataFrame:

                         0
    Caps Lower            
    A    a     0  0.246490
               1 -1.265711
               2 -0.477415
               3 -0.355812
               4 -0.724521
         b     0 -0.409198
               1 -0.062552
               2 -0.731789
               3  1.131616
               4  0.085248
    B    a     0  0.193948
               1  2.010710
               2  0.289300
               3  0.305373
               4  1.376965
         b     0  0.210522
               1  1.431279
               2 -0.247171
               3  0.899074
               4  0.639926
    

    Result:

                 0
    Caps          
    A     1.131616
    B     2.010710
    

    This is how I created the DataFrame, by the way:

    df = pd.DataFrame(np.random.randn(5,4), columns = l)
    df.columns = pd.MultiIndex.from_tuples(df.columns, names=['Caps','Lower'])
    df = pd.DataFrame(df.unstack())