Search code examples
pythonpandascumsum

How to calculate cumulative weekly sum of values if I have daily values, using Pandas?


I am a novice user of Pandas. I have a dataframe that looks like this:

days rainfall
1    3.51
2    1.32
3    0
4    0
5    0
6    0
7    0
8    0
9    0.03
10   0
11   0
12   0.17
13   0.23
14   0.02
15   0
16   0
17   0
18   0.03
19   0.02
20   0
21   0

I would like to add a column (let's call it 'cumulative') that shows the cumulative rainfall values for every week. In other words, I want to calculate the cumulative values for the first seven days (1-7), then the second set of seven days (8-14), and so on.

The end product would look like this:

days rainfall cumulative
1    3.51     4.83
2    1.32     0.45
3    0        0.05
4    0
5    0
6    0
7    0
8    0
9    0.03
10   0
11   0
12   0.17
13   0.23
14   0.02
15   0
16   0
17   0
18   0.03
19   0.02
20   0
21   0

So far I've tried calling rolling with sum but I do not get what I want.

df['cumulative']=df['rainfall'].rolling(min_periods=7, window=7).sum()

Grateful for any tips or advice!


Solution

  • You can do that like this:

    import pandas as pd
    
    df = pd.DataFrame([
        [ 1, 3.51],
        [ 2, 1.32],
        [ 3, 0],
        [ 4, 0],
        [ 5, 0],
        [ 6, 0],
        [ 7, 0],
        [ 8, 0],
        [9, 0.03],
        [10, 0],
        [11, 0],
        [12, 0.17],
        [13, 0.23],
        [14, 0.02],
        [15, 0],
        [16, 0],
        [17, 0],
        [18, 0.03],
        [19, 0.02],
        [20, 0],
        [21, 0]], columns=['days', 'rainfall'])
    result = df['rainfall'].groupby((df['days'] - 1) // 7).sum().reset_index(drop=True)
    print(result)
    # In [418]: %paste -q
    # 0    4.83
    # 1    0.45
    # 2    0.05
    # Name: rainfall, dtype: float64