Search code examples
pythonpandasmulti-index

Slicing pandas multiindex dataframe using max of second level


Supposing that I have this MultiIndex dataframe called df:

     |     |Value
Year |Month|  
1992 |  1  |  3
     |  2  |  5
     |  3  |  8
-----------------
1993 |  1  |  2
     |  2  |  7
----------------
1994 |  1  |  20
     |  2  |  50
     |  3  |  10
     |  4  |  5

How do I select all years and max month for each of those years?

I'd like the following result:

     |     |Value
Year |Month|  
1992 |  3  |  8 
-----------------
1993 |  2  |  7     
----------------
1994 |  4  |  5

I've tried to use

df.loc[(slice(None), [3, 2, 4]),:]

This works, but it's hard-coded. How do I set it to bring always the maximum month level instead of saying it manually?

My index are sorted, so it would be take the last month for each year.

I've also tried to use the .iloc but it doesn't work with multiindex

>>> df.iloc[(slice(None), -1),:]

...
IndexingError: Too many indexers
...

Solution

  • you can group on the first level and take the last of the second level and then df.loc[]:

    df.loc[pd.DataFrame.from_records(df.index).groupby(0)[1].last().items()]
    

                 Value
    Year Month       
    1992 3          8
    1993 2          7
    1994 4          5