I would like to slice off just the last 10 rows of the second level of a multi-indexed dataframe, for each level 1 index. I have read up on loc, iloc, slice etc, but just cant seem to put things together to get what I need.
Here is some code to generate my dataframe:
import pandas as pd
df = pd.DataFrame({'Id': {('A', Timestamp('2008-10-31 00:00:00')): 45846, ('A', Timestamp('2009-10-31 00:00:00')): 45846, ('A', Timestamp('2010-10-31 00:00:00')): 45846, ('A', Timestamp('2011-10-31 00:00:00')): 45846, ('A', Timestamp('2012-10-31 00:00:00')): 45846, ('A', Timestamp('2013-10-31 00:00:00')): 45846, ('A', Timestamp('2014-10-31 00:00:00')): 45846, ('A', Timestamp('2015-10-31 00:00:00')): 45846, ('A', Timestamp('2016-10-31 00:00:00')): 45846, ('A', Timestamp('2017-10-31 00:00:00')): 45846, ('A', Timestamp('2018-10-31 00:00:00')): 45846, ('A', Timestamp('2019-10-31 00:00:00')): 45846, ('A', Timestamp('2020-10-31 00:00:00')): 45846, ('AA', Timestamp('2015-12-31 00:00:00')): 367153, ('AA', Timestamp('2016-12-31 00:00:00')): 367153, ('AA', Timestamp('2017-12-31 00:00:00')): 367153, ('AA', Timestamp('2018-12-31 00:00:00')): 367153, ('AA', Timestamp('2019-12-31 00:00:00')): 367153, ('AA', Timestamp('2020-12-31 00:00:00')): 367153}, 'Currency': {('A', Timestamp('2008-10-31 00:00:00')): 'USD', ('A', Timestamp('2009-10-31 00:00:00')): 'USD', ('A', Timestamp('2010-10-31 00:00:00')): 'USD', ('A', Timestamp('2011-10-31 00:00:00')): 'USD', ('A', Timestamp('2012-10-31 00:00:00')): 'USD', ('A', Timestamp('2013-10-31 00:00:00')): 'USD', ('A', Timestamp('2014-10-31 00:00:00')): 'USD', ('A', Timestamp('2015-10-31 00:00:00')): 'USD', ('A', Timestamp('2016-10-31 00:00:00')): 'USD', ('A', Timestamp('2017-10-31 00:00:00')): 'USD', ('A', Timestamp('2018-10-31 00:00:00')): 'USD', ('A', Timestamp('2019-10-31 00:00:00')): 'USD', ('A', Timestamp('2020-10-31 00:00:00')): 'USD', ('AA', Timestamp('2015-12-31 00:00:00')): 'USD', ('AA', Timestamp('2016-12-31 00:00:00')): 'USD', ('AA', Timestamp('2017-12-31 00:00:00')): 'USD', ('AA', Timestamp('2018-12-31 00:00:00')): 'USD', ('AA', Timestamp('2019-12-31 00:00:00')): 'USD', ('AA', Timestamp('2020-12-31 00:00:00')): 'USD'}, 'Fiscal Year': {('A', Timestamp('2008-10-31 00:00:00')): 2008, ('A', Timestamp('2009-10-31 00:00:00')): 2009, ('A', Timestamp('2010-10-31 00:00:00')): 2010, ('A', Timestamp('2011-10-31 00:00:00')): 2011, ('A', Timestamp('2012-10-31 00:00:00')): 2012, ('A', Timestamp('2013-10-31 00:00:00')): 2013, ('A', Timestamp('2014-10-31 00:00:00')): 2014, ('A', Timestamp('2015-10-31 00:00:00')): 2015, ('A', Timestamp('2016-10-31 00:00:00')): 2016, ('A', Timestamp('2017-10-31 00:00:00')): 2017, ('A', Timestamp('2018-10-31 00:00:00')): 2018, ('A', Timestamp('2019-10-31 00:00:00')): 2019, ('A', Timestamp('2020-10-31 00:00:00')): 2020, ('AA', Timestamp('2015-12-31 00:00:00')): 2015, ('AA', Timestamp('2016-12-31 00:00:00')): 2016, ('AA', Timestamp('2017-12-31 00:00:00')): 2017, ('AA', Timestamp('2018-12-31 00:00:00')): 2018, ('AA', Timestamp('2019-12-31 00:00:00')): 2019, ('AA', Timestamp('2020-12-31 00:00:00')): 2020}, 'Fiscal Period': {('A', Timestamp('2008-10-31 00:00:00')): 'FY', ('A', Timestamp('2009-10-31 00:00:00')): 'FY', ('A', Timestamp('2010-10-31 00:00:00')): 'FY', ('A', Timestamp('2011-10-31 00:00:00')): 'FY', ('A', Timestamp('2012-10-31 00:00:00')): 'FY', ('A', Timestamp('2013-10-31 00:00:00')): 'FY', ('A', Timestamp('2014-10-31 00:00:00')): 'FY', ('A', Timestamp('2015-10-31 00:00:00')): 'FY', ('A', Timestamp('2016-10-31 00:00:00')): 'FY', ('A', Timestamp('2017-10-31 00:00:00')): 'FY', ('A', Timestamp('2018-10-31 00:00:00')): 'FY', ('A', Timestamp('2019-10-31 00:00:00')): 'FY', ('A', Timestamp('2020-10-31 00:00:00')): 'FY', ('AA', Timestamp('2015-12-31 00:00:00')): 'FY', ('AA', Timestamp('2016-12-31 00:00:00')): 'FY', ('AA', Timestamp('2017-12-31 00:00:00')): 'FY', ('AA', Timestamp('2018-12-31 00:00:00')): 'FY', ('AA', Timestamp('2019-12-31 00:00:00')): 'FY', ('AA', Timestamp('2020-12-31 00:00:00')): 'FY'}, 'Publish Date': {('A', Timestamp('2008-10-31 00:00:00')): Timestamp('2008-12-19 00:00:00'), ('A', Timestamp('2009-10-31 00:00:00')): Timestamp('2009-12-21 00:00:00'), ('A', Timestamp('2010-10-31 00:00:00')): Timestamp('2010-12-20 00:00:00'), ('A', Timestamp('2011-10-31 00:00:00')): Timestamp('2011-12-16 00:00:00'), ('A', Timestamp('2012-10-31 00:00:00')): Timestamp('2012-12-20 00:00:00'), ('A', Timestamp('2013-10-31 00:00:00')): Timestamp('2013-12-19 00:00:00'), ('A', Timestamp('2014-10-31 00:00:00')): Timestamp('2014-12-22 00:00:00'), ('A', Timestamp('2015-10-31 00:00:00')): Timestamp('2015-12-21 00:00:00'), ('A', Timestamp('2016-10-31 00:00:00')): Timestamp('2016-12-20 00:00:00'), ('A', Timestamp('2017-10-31 00:00:00')): Timestamp('2017-12-21 00:00:00'), ('A', Timestamp('2018-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2019-10-31 00:00:00')): Timestamp('2019-12-19 00:00:00'), ('A', Timestamp('2020-10-31 00:00:00')): Timestamp('2020-12-18 00:00:00'), ('AA', Timestamp('2015-12-31 00:00:00')): Timestamp('2016-03-02 00:00:00'), ('AA', Timestamp('2016-12-31 00:00:00')): Timestamp('2017-02-03 00:00:00'), ('AA', Timestamp('2017-12-31 00:00:00')): Timestamp('2018-02-26 00:00:00'), ('AA', Timestamp('2018-12-31 00:00:00')): Timestamp('2019-02-26 00:00:00'), ('AA', Timestamp('2019-12-31 00:00:00')): Timestamp('2020-02-21 00:00:00'), ('AA', Timestamp('2020-12-31 00:00:00')): Timestamp('2021-02-25 00:00:00')}, 'Restated Date': {('A', Timestamp('2008-10-31 00:00:00')): Timestamp('2010-12-20 00:00:00'), ('A', Timestamp('2009-10-31 00:00:00')): Timestamp('2011-12-16 00:00:00'), ('A', Timestamp('2010-10-31 00:00:00')): Timestamp('2012-12-20 00:00:00'), ('A', Timestamp('2011-10-31 00:00:00')): Timestamp('2013-12-19 00:00:00'), ('A', Timestamp('2012-10-31 00:00:00')): Timestamp('2014-12-22 00:00:00'), ('A', Timestamp('2013-10-31 00:00:00')): Timestamp('2015-12-21 00:00:00'), ('A', Timestamp('2014-10-31 00:00:00')): Timestamp('2016-12-20 00:00:00'), ('A', Timestamp('2015-10-31 00:00:00')): Timestamp('2017-12-21 00:00:00'), ('A', Timestamp('2016-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2017-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2018-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2019-10-31 00:00:00')): Timestamp('2019-12-19 00:00:00'), ('A', Timestamp('2020-10-31 00:00:00')): Timestamp('2020-12-18 00:00:00'), ('AA', Timestamp('2015-12-31 00:00:00')): Timestamp('2017-03-15 00:00:00'), ('AA', Timestamp('2016-12-31 00:00:00')): Timestamp('2017-03-15 00:00:00'), ('AA', Timestamp('2017-12-31 00:00:00')): Timestamp('2018-02-26 00:00:00'), ('AA', Timestamp('2018-12-31 00:00:00')): Timestamp('2019-02-26 00:00:00'), ('AA', Timestamp('2019-12-31 00:00:00')): Timestamp('2020-02-21 00:00:00'), ('AA', Timestamp('2020-12-31 00:00:00')): Timestamp('2021-02-25 00:00:00')}})
Should yield this:
Id Currency Fiscal Year Fiscal Period Publish Date Restated Date
Ticker Report Date
A 2008-10-31 45846 USD 2008 FY 2008-12-19 2010-12-20
2009-10-31 45846 USD 2009 FY 2009-12-21 2011-12-16
2010-10-31 45846 USD 2010 FY 2010-12-20 2012-12-20
2011-10-31 45846 USD 2011 FY 2011-12-16 2013-12-19
2012-10-31 45846 USD 2012 FY 2012-12-20 2014-12-22
2013-10-31 45846 USD 2013 FY 2013-12-19 2015-12-21
2014-10-31 45846 USD 2014 FY 2014-12-22 2016-12-20
2015-10-31 45846 USD 2015 FY 2015-12-21 2017-12-21
2016-10-31 45846 USD 2016 FY 2016-12-20 2018-12-20
2017-10-31 45846 USD 2017 FY 2017-12-21 2018-12-20
2018-10-31 45846 USD 2018 FY 2018-12-20 2018-12-20
2019-10-31 45846 USD 2019 FY 2019-12-19 2019-12-19
2020-10-31 45846 USD 2020 FY 2020-12-18 2020-12-18
AA 2015-12-31 367153 USD 2015 FY 2016-03-02 2017-03-15
2016-12-31 367153 USD 2016 FY 2017-02-03 2017-03-15
2017-12-31 367153 USD 2017 FY 2018-02-26 2018-02-26
2018-12-31 367153 USD 2018 FY 2019-02-26 2019-02-26
2019-12-31 367153 USD 2019 FY 2020-02-21 2020-02-21
2020-12-31 367153 USD 2020 FY 2021-02-25 2021-02-25
What I would like after slicing is this:
Id Currency Fiscal Year Fiscal Period Publish Date Restated Date
Ticker Report Date
A 2011-10-31 45846 USD 2011 FY 2011-12-16 2013-12-19
2012-10-31 45846 USD 2012 FY 2012-12-20 2014-12-22
2013-10-31 45846 USD 2013 FY 2013-12-19 2015-12-21
2014-10-31 45846 USD 2014 FY 2014-12-22 2016-12-20
2015-10-31 45846 USD 2015 FY 2015-12-21 2017-12-21
2016-10-31 45846 USD 2016 FY 2016-12-20 2018-12-20
2017-10-31 45846 USD 2017 FY 2017-12-21 2018-12-20
2018-10-31 45846 USD 2018 FY 2018-12-20 2018-12-20
2019-10-31 45846 USD 2019 FY 2019-12-19 2019-12-19
2020-10-31 45846 USD 2020 FY 2020-12-18 2020-12-18
AA 2015-12-31 367153 USD 2015 FY 2016-03-02 2017-03-15
2016-12-31 367153 USD 2016 FY 2017-02-03 2017-03-15
2017-12-31 367153 USD 2017 FY 2018-02-26 2018-02-26
2018-12-31 367153 USD 2018 FY 2019-02-26 2019-02-26
2019-12-31 367153 USD 2019 FY 2020-02-21 2020-02-21
2020-12-31 367153 USD 2020 FY 2021-02-25 2021-02-25
We can use groupby tail
to get the last n
elements relative to level=0
:
filtered_df = df.groupby(level=0).tail(10)
*Note if going to assign values to avoid a SetWithCopyWarning
will need to copy
when subsetting the frame:
filtered_df = df.groupby(level=0).tail(10).copy()
filtered_df
:
Id Currency ... Publish Date Restated Date
A 2011-10-31 45846 USD ... 2011-12-16 2013-12-19
2012-10-31 45846 USD ... 2012-12-20 2014-12-22
2013-10-31 45846 USD ... 2013-12-19 2015-12-21
2014-10-31 45846 USD ... 2014-12-22 2016-12-20
2015-10-31 45846 USD ... 2015-12-21 2017-12-21
2016-10-31 45846 USD ... 2016-12-20 2018-12-20
2017-10-31 45846 USD ... 2017-12-21 2018-12-20
2018-10-31 45846 USD ... 2018-12-20 2018-12-20
2019-10-31 45846 USD ... 2019-12-19 2019-12-19
2020-10-31 45846 USD ... 2020-12-18 2020-12-18
AA 2015-12-31 367153 USD ... 2016-03-02 2017-03-15
2016-12-31 367153 USD ... 2017-02-03 2017-03-15
2017-12-31 367153 USD ... 2018-02-26 2018-02-26
2018-12-31 367153 USD ... 2019-02-26 2019-02-26
2019-12-31 367153 USD ... 2020-02-21 2020-02-21
2020-12-31 367153 USD ... 2021-02-25 2021-02-25
[16 rows x 6 columns]