Resample a DataFrame and divide values over the new sample frequency


How do I upsample a dataframe using resample() to get the initial values divided over the new sample frequency?

Dataframe with monthly sample frequency

                       date        revenue
0 2021-11-01 00:00:00+00:00        300
1 2021-10-01 00:00:00+00:00        500
2 2021-09-01 00:00:00+00:00        100
3 2021-08-01 00:00:00+00:00        50
4 2021-07-01 00:00:00+00:00        200
5 2021-06-01 00:00:00+00:00        150

Approximate expected DataFrame, with revenue divided over the days in each month

                                 revenue
date                                    
2021-06-01 00:00:00+00:00    4.8
2021-06-02 00:00:00+00:00    4.8
2021-06-03 00:00:00+00:00    4.8
2021-06-04 00:00:00+00:00    4.8
2021-06-05 00:00:00+00:00    4.8
...                                  ...
2021-11-28 00:00:00+00:00    9.6
2021-11-29 00:00:00+00:00    9.6
2021-11-30 00:00:00+00:00    9.6

I.e., I want to be sure that each value gets divided by the number of days in that specific month.


Solution

  • You can use asfreq to convert the time series from monthly to daily frequency, use ffill to forward-fill the values, and then divide revenue by the daysinmonth attribute of the DatetimeIndex to get the distributed revenue. A NaN row is first added at the end of the last month so that the daily index covers that whole month.

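    import numpy as np
    import pandas as pd

    s = df.set_index('date')
    # Add a placeholder row at the end of the last month so it is fully covered
    s.loc[s.index.max() + pd.offsets.MonthEnd()] = np.nan

    # Upsample to daily rows, then carry each monthly value forward
    s = s.asfreq('D').ffill()
    s['revenue'] /= s.index.daysinmonth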
    

    print(s)
                                 revenue
    date                                
    2021-06-01 00:00:00+00:00   5.000000
    2021-06-02 00:00:00+00:00   5.000000
    2021-06-03 00:00:00+00:00   5.000000
    2021-06-04 00:00:00+00:00   5.000000
    2021-06-05 00:00:00+00:00   5.000000
    ...
    2021-07-24 00:00:00+00:00   6.451613
    2021-07-25 00:00:00+00:00   6.451613
    ...
    2021-11-30 00:00:00+00:00  10.000000
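
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds the sample data from the question and swaps asfreq for an explicit reindex over a daily date range, which is a variation on the answer above and not a different method. The helper name spread_monthly_over_days is hypothetical, and the sketch assumes each date is the first day of its month.

```python
import pandas as pd

# Sample data copied from the question (monthly revenue, UTC timestamps).
df = pd.DataFrame({
    "date": pd.to_datetime(
        ["2021-11-01", "2021-10-01", "2021-09-01",
         "2021-08-01", "2021-07-01", "2021-06-01"], utc=True),
    "revenue": [300, 500, 100, 50, 200, 150],
})


def spread_monthly_over_days(frame):
    """Hypothetical helper: divide each month's revenue evenly over its days."""
    # Sort ascending so that forward-filling during reindex is well defined.
    monthly = frame.set_index("date").sort_index().astype({"revenue": "float64"})
    # Daily index from the first month start to the end of the last month.
    last_day = monthly.index.max() + pd.offsets.MonthEnd(0)
    days = pd.date_range(monthly.index.min(), last_day, freq="D", name="date")
    # Carry each monthly value forward onto every day of that month.
    daily = monthly.reindex(days, method="ffill")
    # Divide by the number of days in each row's own month.
    daily["revenue"] = daily["revenue"] / daily.index.days_in_month.to_numpy()
    return daily


daily = spread_monthly_over_days(df)

# Sanity check: summing the daily values per month gives back the monthly totals.
totals = daily["revenue"].groupby(daily.index.strftime("%Y-%m")).sum()
print(daily)
print(totals.round(6))
```

The per-month totals printed at the end should match the original monthly figures up to floating-point rounding, which is a quick way to confirm that nothing was lost or double-counted when upsampling.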