I have a time series data of the following form:
Item 2020 Jan 2020 Feb 2020 Mar 2020 Apr 2020 May 2020 Jun
0 A 0 1 2 3 4 5
1 B 5 4 3 2 1 0
This is monthly data but I want to get quarterly data of this data. A normal quarterly data would be calculated by summing up Jan-Mar and Apr-Jun and would look like this:
Item 2020 Q1 2020 Q2
0 A 3 12
1 B 12 3
I want to get smoother quarterly data so it would shift by only 1 month for each new data item, not 3 months. So it would have Jan-Mar, then Feb-Apr, then Mar-May, and Apr-Jun. So the resulting data would look like this:
Item 2020 Q1 2020 Q1 2020 Q1 2020 Q2
0 A 3 6 9 12
1 B 12 9 6 3
I believe this is similar to cumsum which can be used as follows:
df_dates = df.iloc[:,1:]
df_dates.cumsum(axis=1)
which leads to the following result:
2020 Jan 2020 Feb 2020 Mar 2020 Apr 2020 May 2020 Jun
0 0 1 3 6 10 15
1 5 9 12 14 15 15
but instead of getting the sum over the whole time, it gets the sum of the nearest 3 months (a quarter). I do not know how this version of cumsum is called but I saw it in many places so I believe there might be a library function for that.
Let us solve in steps
Item
column3
s = df.set_index('Item')
s.columns = pd.PeriodIndex(s.columns, freq='M').strftime('%Y Q%q')
s = s.rolling(3, axis=1).sum().shift(-2, axis=1).iloc[:, :-2]
print(s)
2020 Q1 2020 Q1 2020 Q1 2020 Q2
Item
A 3.0 6.0 9.0 12.0
B 12.0 9.0 6.0 3.0