I've got the following Pandas DataFrame:
import datetime as dt
import pandas as pd
import numpy as np
offset = 3 * pd.tseries.offsets.BMonthEnd()
bond_index_1 = pd.date_range('1/1/14', '1/1/18', freq=offset, name='date')
bond_1 = pd.DataFrame(data = np.random.uniform(0, 5, 16),
index= bond_index_1, columns=['cash_flow'])
bond_index_2 = pd.date_range('3/1/14', '3/1/21', freq=offset, name='date')
bond_2 = pd.DataFrame(data = 2*np.random.uniform(0, 10, 28),
index= bond_index_2, columns=['cash_flow'])
df_merged = pd.concat([bond_1, bond_2], keys=['Bond_1', 'Bond_2'])
How can I get a sum of the values in the cash_flow
column by a custom range, starting at 2014-6-30, ending at 2016-12-31 in 6 Month intervals.
Therefore, the intervals would look like 2014-06-30, 2015-12-31, 2015-06-30, 2015-12-31, 2016-06-30, 2016-12-31
It would also ignore the 'bond' name index in the MultiIndex.
I've tried using TimeGrouper
but couldn't get it to work since TimeGrouper
starts at the earliest value in your time series and moves forward.
resample seems one of the easy way to solve your problem.
print df_merged.reset_index().set_index('date').resample('6M', how='sum', closed='left', loffset='-1M')
yield,
cash_flow
date
2014-06-30 16.058478
2014-12-31 24.282106
2015-06-30 32.777176
2015-12-31 33.661801
2016-06-30 26.779571
2016-12-31 17.435089
2017-06-30 30.914194
2017-12-31 20.117823
2018-06-30 29.505178
2018-12-31 17.245787
2019-06-30 22.975058
2019-12-31 17.742220
2020-06-30 11.646266
2020-12-31 20.077632