Search code examples
pythonpandasmulti-index

Dropping last row from multiindex dataframe


I have a Pandas dataframe that looks something like this:

                   close      volume
date       ticker                    
2017-01-03 AAPL    116.15  28781865.0
           AMZN    753.67   3521066.0
           MSFT     62.58  20694101.0
           TSLA    216.99   5923254.0
2017-01-04 AAPL    116.02  21118116.0
           AMZN    757.18   2510526.0
           MSFT     62.30  21339969.0
           TSLA    226.99  11213471.0
2017-01-05 AAPL    116.61  22193587.0
           AMZN    780.45   5830068.0
           MSFT     62.30  24875968.0
           TSLA    226.75   5911695.0
2017-01-06 AAPL    117.91  31751900.0
           AMZN    795.99   5986234.0
           MSFT     62.84  19922919.0
           TSLA    229.01   5527893.0
2017-01-09 AAPL    118.99  33561948.0
           AMZN    796.92   3446109.0
           MSFT     62.64  20382730.0
           TSLA    231.28   3979484.0
2017-01-10 AAPL    119.11  24462051.0
           AMZN    795.90   2558369.0
           MSFT     62.62  18593004.0
           TSLA    229.87   3659955.0

I would like to drop all rows in the 'date' index level except for the most recent date, which is always the last row. So in this case the result should be:

                   close      volume
date       ticker                    
2017-01-10 AAPL    119.11  24462051.0
           AMZN    795.90   2558369.0
           MSFT     62.62  18593004.0
           TSLA    229.87   3659955.0

I have tried

pricing.drop(pricing.index[0:len(pricing)-1])

But this looks at the 'ticker' level instead of the date level, returning only the very last row instead of all rows for the last date:

                   close      volume
date       ticker                    

2017-01-10 TSLA    229.87   3659955.0           

I tried adding level=0 or level='date' to it but then it would just return the full dataframe with nothing dropped.

Does anyone know a good way to do this?


Solution

  • Use Index.get_level_values for values from first level, select last by indexing and last select by DataFrame.xs with drop_level for avoid remove first level:

    df = df.xs(df.index.get_level_values(0)[-1], drop_level=False)
    print (df)
                        close      volume
    date       ticker                    
    2017-01-10 AAPL    119.11  24462051.0
               AMZN    795.90   2558369.0
               MSFT     62.62  18593004.0
               TSLA    229.87   3659955.0
    

    Another solution:

    df = df.loc[[df.index[-1][0]], :]
    print (df)
                        close      volume
    date       ticker                    
    2017-01-10 AAPL    119.11  24462051.0
               AMZN    795.90   2558369.0
               MSFT     62.62  18593004.0
               TSLA    229.87   3659955.0
    

    Details:

    print (df.index[-1])
    ('2017-01-10', 'TSLA')
    
    print (df.index[-1][0])
    2017-01-10