I have a pandas timeseries with an index 'date' which has lots of duplicate dates. Now, I don't want to drop the duplicates as they refer to unique rows of data. As my data is published every week, I'd like to interpolate available data on a daily basis through forward fill.
Here's an example dataframe:
Product ticker lots
date
2012-01-05 M XYDZ Index 1000
2012-01-05 M XDZY Index 2000
2012-01-05 M DZXY Index 3000
2012-01-05 M YXZD Index 4000
2012-01-05 M ZDXY Index 5000
2012-01-12 M XYDZ Index 1150
2012-01-12 M XDZY Index 2250
2012-01-12 M DZXY Index 3350
2012-01-12 M YXZD Index 4450
2012-01-12 M ZDXY Index 5550
What I'd like to do is get this:
Product ticker lots
date
2012-01-05 M XYDZ Index 1000
2012-01-05 M XDZY Index 2000
2012-01-05 M DZXY Index 3000
2012-01-05 M YXZD Index 4000
2012-01-05 M ZDXY Index 5000
2012-01-06 M XYDZ Index 1000
2012-01-06 M XDZY Index 2000
2012-01-06 M DZXY Index 3000
2012-01-06 M YXZD Index 4000
2012-01-06 M ZDXY Index 5000 ...
and so on until the 12th, where it'd repeat the same until next available data.
Not sure of the exact expected logic, but if you have groups you could perform this within groupby
:
out = (df.groupby(['Product', 'ticker'], as_index=False)
.apply(lambda x: x.asfreq('D').ffill()).droplevel(0)
)
Output:
Product ticker lots
date
2012-01-05 M DZXY Index 3000.0
2012-01-06 M DZXY Index 3000.0
2012-01-07 M DZXY Index 3000.0
2012-01-08 M DZXY Index 3000.0
2012-01-09 M DZXY Index 3000.0
2012-01-10 M DZXY Index 3000.0
2012-01-11 M DZXY Index 3000.0
2012-01-12 M DZXY Index 3350.0
2012-01-05 M XDZY Index 2000.0
2012-01-06 M XDZY Index 2000.0
2012-01-07 M XDZY Index 2000.0
2012-01-08 M XDZY Index 2000.0
2012-01-09 M XDZY Index 2000.0
2012-01-10 M XDZY Index 2000.0
2012-01-11 M XDZY Index 2000.0