Search code examples
pythonpandasmulti-index

How can I get the last 10 rows of the second level of a multi-indexed dataframe?


I would like to slice off just the last 10 rows of the second level of a multi-indexed dataframe, for each level 1 index. I have read up on loc, iloc, slice etc, but just cant seem to put things together to get what I need.

Here is some code to generate my dataframe:

import pandas as pd
df = pd.DataFrame({'Id': {('A', Timestamp('2008-10-31 00:00:00')): 45846, ('A', Timestamp('2009-10-31 00:00:00')): 45846, ('A', Timestamp('2010-10-31 00:00:00')): 45846, ('A', Timestamp('2011-10-31 00:00:00')): 45846, ('A', Timestamp('2012-10-31 00:00:00')): 45846, ('A', Timestamp('2013-10-31 00:00:00')): 45846, ('A', Timestamp('2014-10-31 00:00:00')): 45846, ('A', Timestamp('2015-10-31 00:00:00')): 45846, ('A', Timestamp('2016-10-31 00:00:00')): 45846, ('A', Timestamp('2017-10-31 00:00:00')): 45846, ('A', Timestamp('2018-10-31 00:00:00')): 45846, ('A', Timestamp('2019-10-31 00:00:00')): 45846, ('A', Timestamp('2020-10-31 00:00:00')): 45846, ('AA', Timestamp('2015-12-31 00:00:00')): 367153, ('AA', Timestamp('2016-12-31 00:00:00')): 367153, ('AA', Timestamp('2017-12-31 00:00:00')): 367153, ('AA', Timestamp('2018-12-31 00:00:00')): 367153, ('AA', Timestamp('2019-12-31 00:00:00')): 367153, ('AA', Timestamp('2020-12-31 00:00:00')): 367153}, 'Currency': {('A', Timestamp('2008-10-31 00:00:00')): 'USD', ('A', Timestamp('2009-10-31 00:00:00')): 'USD', ('A', Timestamp('2010-10-31 00:00:00')): 'USD', ('A', Timestamp('2011-10-31 00:00:00')): 'USD', ('A', Timestamp('2012-10-31 00:00:00')): 'USD', ('A', Timestamp('2013-10-31 00:00:00')): 'USD', ('A', Timestamp('2014-10-31 00:00:00')): 'USD', ('A', Timestamp('2015-10-31 00:00:00')): 'USD', ('A', Timestamp('2016-10-31 00:00:00')): 'USD', ('A', Timestamp('2017-10-31 00:00:00')): 'USD', ('A', Timestamp('2018-10-31 00:00:00')): 'USD', ('A', Timestamp('2019-10-31 00:00:00')): 'USD', ('A', Timestamp('2020-10-31 00:00:00')): 'USD', ('AA', Timestamp('2015-12-31 00:00:00')): 'USD', ('AA', Timestamp('2016-12-31 00:00:00')): 'USD', ('AA', Timestamp('2017-12-31 00:00:00')): 'USD', ('AA', Timestamp('2018-12-31 00:00:00')): 'USD', ('AA', Timestamp('2019-12-31 00:00:00')): 'USD', ('AA', Timestamp('2020-12-31 00:00:00')): 'USD'}, 'Fiscal Year': {('A', Timestamp('2008-10-31 00:00:00')): 2008, ('A', Timestamp('2009-10-31 00:00:00')): 2009, ('A', Timestamp('2010-10-31 00:00:00')): 2010, ('A', Timestamp('2011-10-31 00:00:00')): 2011, ('A', Timestamp('2012-10-31 00:00:00')): 2012, ('A', Timestamp('2013-10-31 00:00:00')): 2013, ('A', Timestamp('2014-10-31 00:00:00')): 2014, ('A', Timestamp('2015-10-31 00:00:00')): 2015, ('A', Timestamp('2016-10-31 00:00:00')): 2016, ('A', Timestamp('2017-10-31 00:00:00')): 2017, ('A', Timestamp('2018-10-31 00:00:00')): 2018, ('A', Timestamp('2019-10-31 00:00:00')): 2019, ('A', Timestamp('2020-10-31 00:00:00')): 2020, ('AA', Timestamp('2015-12-31 00:00:00')): 2015, ('AA', Timestamp('2016-12-31 00:00:00')): 2016, ('AA', Timestamp('2017-12-31 00:00:00')): 2017, ('AA', Timestamp('2018-12-31 00:00:00')): 2018, ('AA', Timestamp('2019-12-31 00:00:00')): 2019, ('AA', Timestamp('2020-12-31 00:00:00')): 2020}, 'Fiscal Period': {('A', Timestamp('2008-10-31 00:00:00')): 'FY', ('A', Timestamp('2009-10-31 00:00:00')): 'FY', ('A', Timestamp('2010-10-31 00:00:00')): 'FY', ('A', Timestamp('2011-10-31 00:00:00')): 'FY', ('A', Timestamp('2012-10-31 00:00:00')): 'FY', ('A', Timestamp('2013-10-31 00:00:00')): 'FY', ('A', Timestamp('2014-10-31 00:00:00')): 'FY', ('A', Timestamp('2015-10-31 00:00:00')): 'FY', ('A', Timestamp('2016-10-31 00:00:00')): 'FY', ('A', Timestamp('2017-10-31 00:00:00')): 'FY', ('A', Timestamp('2018-10-31 00:00:00')): 'FY', ('A', Timestamp('2019-10-31 00:00:00')): 'FY', ('A', Timestamp('2020-10-31 00:00:00')): 'FY', ('AA', Timestamp('2015-12-31 00:00:00')): 'FY', ('AA', Timestamp('2016-12-31 00:00:00')): 'FY', ('AA', Timestamp('2017-12-31 00:00:00')): 'FY', ('AA', Timestamp('2018-12-31 00:00:00')): 'FY', ('AA', Timestamp('2019-12-31 00:00:00')): 'FY', ('AA', Timestamp('2020-12-31 00:00:00')): 'FY'}, 'Publish Date': {('A', Timestamp('2008-10-31 00:00:00')): Timestamp('2008-12-19 00:00:00'), ('A', Timestamp('2009-10-31 00:00:00')): Timestamp('2009-12-21 00:00:00'), ('A', Timestamp('2010-10-31 00:00:00')): Timestamp('2010-12-20 00:00:00'), ('A', Timestamp('2011-10-31 00:00:00')): Timestamp('2011-12-16 00:00:00'), ('A', Timestamp('2012-10-31 00:00:00')): Timestamp('2012-12-20 00:00:00'), ('A', Timestamp('2013-10-31 00:00:00')): Timestamp('2013-12-19 00:00:00'), ('A', Timestamp('2014-10-31 00:00:00')): Timestamp('2014-12-22 00:00:00'), ('A', Timestamp('2015-10-31 00:00:00')): Timestamp('2015-12-21 00:00:00'), ('A', Timestamp('2016-10-31 00:00:00')): Timestamp('2016-12-20 00:00:00'), ('A', Timestamp('2017-10-31 00:00:00')): Timestamp('2017-12-21 00:00:00'), ('A', Timestamp('2018-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2019-10-31 00:00:00')): Timestamp('2019-12-19 00:00:00'), ('A', Timestamp('2020-10-31 00:00:00')): Timestamp('2020-12-18 00:00:00'), ('AA', Timestamp('2015-12-31 00:00:00')): Timestamp('2016-03-02 00:00:00'), ('AA', Timestamp('2016-12-31 00:00:00')): Timestamp('2017-02-03 00:00:00'), ('AA', Timestamp('2017-12-31 00:00:00')): Timestamp('2018-02-26 00:00:00'), ('AA', Timestamp('2018-12-31 00:00:00')): Timestamp('2019-02-26 00:00:00'), ('AA', Timestamp('2019-12-31 00:00:00')): Timestamp('2020-02-21 00:00:00'), ('AA', Timestamp('2020-12-31 00:00:00')): Timestamp('2021-02-25 00:00:00')}, 'Restated Date': {('A', Timestamp('2008-10-31 00:00:00')): Timestamp('2010-12-20 00:00:00'), ('A', Timestamp('2009-10-31 00:00:00')): Timestamp('2011-12-16 00:00:00'), ('A', Timestamp('2010-10-31 00:00:00')): Timestamp('2012-12-20 00:00:00'), ('A', Timestamp('2011-10-31 00:00:00')): Timestamp('2013-12-19 00:00:00'), ('A', Timestamp('2012-10-31 00:00:00')): Timestamp('2014-12-22 00:00:00'), ('A', Timestamp('2013-10-31 00:00:00')): Timestamp('2015-12-21 00:00:00'), ('A', Timestamp('2014-10-31 00:00:00')): Timestamp('2016-12-20 00:00:00'), ('A', Timestamp('2015-10-31 00:00:00')): Timestamp('2017-12-21 00:00:00'), ('A', Timestamp('2016-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2017-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2018-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2019-10-31 00:00:00')): Timestamp('2019-12-19 00:00:00'), ('A', Timestamp('2020-10-31 00:00:00')): Timestamp('2020-12-18 00:00:00'), ('AA', Timestamp('2015-12-31 00:00:00')): Timestamp('2017-03-15 00:00:00'), ('AA', Timestamp('2016-12-31 00:00:00')): Timestamp('2017-03-15 00:00:00'), ('AA', Timestamp('2017-12-31 00:00:00')): Timestamp('2018-02-26 00:00:00'), ('AA', Timestamp('2018-12-31 00:00:00')): Timestamp('2019-02-26 00:00:00'), ('AA', Timestamp('2019-12-31 00:00:00')): Timestamp('2020-02-21 00:00:00'), ('AA', Timestamp('2020-12-31 00:00:00')): Timestamp('2021-02-25 00:00:00')}})

Should yield this:

                        Id      Currency  Fiscal Year Fiscal Period Publish Date Restated Date
Ticker Report Date                                                                         
A      2008-10-31      45846      USD         2008            FY   2008-12-19    2010-12-20
       2009-10-31      45846      USD         2009            FY   2009-12-21    2011-12-16
       2010-10-31      45846      USD         2010            FY   2010-12-20    2012-12-20
       2011-10-31      45846      USD         2011            FY   2011-12-16    2013-12-19
       2012-10-31      45846      USD         2012            FY   2012-12-20    2014-12-22
       2013-10-31      45846      USD         2013            FY   2013-12-19    2015-12-21
       2014-10-31      45846      USD         2014            FY   2014-12-22    2016-12-20
       2015-10-31      45846      USD         2015            FY   2015-12-21    2017-12-21
       2016-10-31      45846      USD         2016            FY   2016-12-20    2018-12-20
       2017-10-31      45846      USD         2017            FY   2017-12-21    2018-12-20
       2018-10-31      45846      USD         2018            FY   2018-12-20    2018-12-20
       2019-10-31      45846      USD         2019            FY   2019-12-19    2019-12-19
       2020-10-31      45846      USD         2020            FY   2020-12-18    2020-12-18
AA     2015-12-31     367153      USD         2015            FY   2016-03-02    2017-03-15
       2016-12-31     367153      USD         2016            FY   2017-02-03    2017-03-15
       2017-12-31     367153      USD         2017            FY   2018-02-26    2018-02-26
       2018-12-31     367153      USD         2018            FY   2019-02-26    2019-02-26
       2019-12-31     367153      USD         2019            FY   2020-02-21    2020-02-21
       2020-12-31     367153      USD         2020            FY   2021-02-25    2021-02-25

What I would like after slicing is this:

                        Id      Currency  Fiscal Year Fiscal Period Publish Date Restated Date
Ticker Report Date                                                                         
A      2011-10-31      45846      USD         2011            FY   2011-12-16    2013-12-19
       2012-10-31      45846      USD         2012            FY   2012-12-20    2014-12-22
       2013-10-31      45846      USD         2013            FY   2013-12-19    2015-12-21
       2014-10-31      45846      USD         2014            FY   2014-12-22    2016-12-20
       2015-10-31      45846      USD         2015            FY   2015-12-21    2017-12-21
       2016-10-31      45846      USD         2016            FY   2016-12-20    2018-12-20
       2017-10-31      45846      USD         2017            FY   2017-12-21    2018-12-20
       2018-10-31      45846      USD         2018            FY   2018-12-20    2018-12-20
       2019-10-31      45846      USD         2019            FY   2019-12-19    2019-12-19
       2020-10-31      45846      USD         2020            FY   2020-12-18    2020-12-18
AA     2015-12-31     367153      USD         2015            FY   2016-03-02    2017-03-15
       2016-12-31     367153      USD         2016            FY   2017-02-03    2017-03-15
       2017-12-31     367153      USD         2017            FY   2018-02-26    2018-02-26
       2018-12-31     367153      USD         2018            FY   2019-02-26    2019-02-26
       2019-12-31     367153      USD         2019            FY   2020-02-21    2020-02-21
       2020-12-31     367153      USD         2020            FY   2021-02-25    2021-02-25

Solution

  • We can use groupby tail to get the last n elements relative to level=0:

    filtered_df = df.groupby(level=0).tail(10)
    

    *Note if going to assign values to avoid a SetWithCopyWarning will need to copy when subsetting the frame:

    filtered_df = df.groupby(level=0).tail(10).copy()
    

    filtered_df:

                       Id Currency  ...  Publish Date Restated Date
    A  2011-10-31   45846      USD  ...    2011-12-16    2013-12-19
       2012-10-31   45846      USD  ...    2012-12-20    2014-12-22
       2013-10-31   45846      USD  ...    2013-12-19    2015-12-21
       2014-10-31   45846      USD  ...    2014-12-22    2016-12-20
       2015-10-31   45846      USD  ...    2015-12-21    2017-12-21
       2016-10-31   45846      USD  ...    2016-12-20    2018-12-20
       2017-10-31   45846      USD  ...    2017-12-21    2018-12-20
       2018-10-31   45846      USD  ...    2018-12-20    2018-12-20
       2019-10-31   45846      USD  ...    2019-12-19    2019-12-19
       2020-10-31   45846      USD  ...    2020-12-18    2020-12-18
    AA 2015-12-31  367153      USD  ...    2016-03-02    2017-03-15
       2016-12-31  367153      USD  ...    2017-02-03    2017-03-15
       2017-12-31  367153      USD  ...    2018-02-26    2018-02-26
       2018-12-31  367153      USD  ...    2019-02-26    2019-02-26
       2019-12-31  367153      USD  ...    2020-02-21    2020-02-21
       2020-12-31  367153      USD  ...    2021-02-25    2021-02-25
    
    [16 rows x 6 columns]