I am a novice user of Pandas. I have a dataframe that looks like this:
days rainfall
1 3.51
2 1.32
3 0
4 0
5 0
6 0
7 0
8 0
9 0.03
10 0
11 0
12 0.17
13 0.23
14 0.02
15 0
16 0
17 0
18 0.03
19 0.02
20 0
21 0
I would like to add a column (let's call it 'cumulative') that shows the cumulative rainfall values for every week. In other words, I want to calculate the cumulative values for the first seven days (1-7), then the second set of seven days (8-14), and so on.
The end product would look like this:
days rainfall cumulative
1 3.51 4.83
2 1.32 0.45
3 0 0.05
4 0
5 0
6 0
7 0
8 0
9 0.03
10 0
11 0
12 0.17
13 0.23
14 0.02
15 0
16 0
17 0
18 0.03
19 0.02
20 0
21 0
So far I've tried calling rolling
with sum
but I do not get what I want.
df['cumulative']=df['rainfall'].rolling(min_periods=7, window=7).sum()
Grateful for any tips or advice!
You can do that like this:
import pandas as pd
df = pd.DataFrame([
[ 1, 3.51],
[ 2, 1.32],
[ 3, 0],
[ 4, 0],
[ 5, 0],
[ 6, 0],
[ 7, 0],
[ 8, 0],
[9, 0.03],
[10, 0],
[11, 0],
[12, 0.17],
[13, 0.23],
[14, 0.02],
[15, 0],
[16, 0],
[17, 0],
[18, 0.03],
[19, 0.02],
[20, 0],
[21, 0]], columns=['days', 'rainfall'])
result = df['rainfall'].groupby((df['days'] - 1) // 7).sum().reset_index(drop=True)
print(result)
# In [418]: %paste -q
# 0 4.83
# 1 0.45
# 2 0.05
# Name: rainfall, dtype: float64