Search code examples
pythonpandaspandas-groupbydate-arithmetic

Custom Time Periods with Groupby


I've got the following Pandas DataFrame:

import datetime as dt
import pandas as pd
import numpy as np

offset = 3 * pd.tseries.offsets.BMonthEnd()

bond_index_1 = pd.date_range('1/1/14', '1/1/18', freq=offset, name='date')
bond_1 = pd.DataFrame(data = np.random.uniform(0, 5, 16),
                      index= bond_index_1, columns=['cash_flow'])

bond_index_2 = pd.date_range('3/1/14', '3/1/21', freq=offset, name='date')
bond_2 = pd.DataFrame(data = 2*np.random.uniform(0, 10, 28),
                      index= bond_index_2, columns=['cash_flow'])

df_merged = pd.concat([bond_1, bond_2], keys=['Bond_1', 'Bond_2'])

How can I get a sum of the values in the cash_flow column by a custom range, starting at 2014-6-30, ending at 2016-12-31 in 6 Month intervals.

Therefore, the intervals would look like 2014-06-30, 2015-12-31, 2015-06-30, 2015-12-31, 2016-06-30, 2016-12-31

It would also ignore the 'bond' name index in the MultiIndex.

I've tried using TimeGrouper but couldn't get it to work since TimeGrouper starts at the earliest value in your time series and moves forward.


Solution

  • resample seems one of the easy way to solve your problem.

    print df_merged.reset_index().set_index('date').resample('6M', how='sum', closed='left', loffset='-1M')
    

    yield,

                cash_flow
    date                 
    2014-06-30  16.058478
    2014-12-31  24.282106
    2015-06-30  32.777176
    2015-12-31  33.661801
    2016-06-30  26.779571
    2016-12-31  17.435089
    2017-06-30  30.914194
    2017-12-31  20.117823
    2018-06-30  29.505178
    2018-12-31  17.245787
    2019-06-30  22.975058
    2019-12-31  17.742220
    2020-06-30  11.646266
    2020-12-31  20.077632