I am working on a pandas dataframe and I have to replace column booking value if less than 25 with average of 7 days example x = 22-03-2020
has booking value 3, I have to replace it
x = average(before 3 days + last year same date + after 3 days/7))
After replace
I have many values to replace, manually i am able to do it but i need some shortcut.
You can use rolling
with center=True
, but need all values of days, so add Series.asfreq
, then add one year to DatetimeIndex
and because leap year is 2020 is added mean
for aggregate duplicates in end of February, last set values by mask:
df = pd.DataFrame({'date':pd.date_range('2019-03-19', periods=8),
'Booking':[92,109,144,109,122,76,78, 3]})
df.loc[7, 'date'] = pd.to_datetime('2020-03-22')
df = df.set_index('date')
s = df['Booking'].asfreq('d').rolling(7, center=True).mean()
s.index = s.index + pd.offsets.DateOffset(years=1)
s = s.mean(level=0)
print (s)
date
2020-03-19 NaN
2020-03-20 NaN
2020-03-21 NaN
2020-03-22 104.285714
2020-03-23 NaN
2021-03-18 NaN
2021-03-19 NaN
2021-03-20 NaN
2021-03-21 NaN
2021-03-22 NaN
Name: Booking, Length: 369, dtype: float64
mask = df['Booking'].lt(25)
df.loc[mask, 'Booking'] = s
print (df)
Booking
date
2019-03-19 92.000000
2019-03-20 109.000000
2019-03-21 144.000000
2019-03-22 109.000000
2019-03-23 122.000000
2019-03-24 76.000000
2019-03-25 78.000000
2020-03-22 104.285714