Search code examples
pythonpython-3.xpandasdataframeaverage

Pandas: How to calculate average one value to after another (succeeding average)


Imagine a dataset like below:


result  country          start      end
5       A              2/14/2022    2/21/2022
10      A              2/21/2022    2/28/2022
30      B              2/28/2022    3/7/2022
50      C               1/3/2022    1/10/2022
60      C              1/10/2022    1/17/2022
70      D              1/17/2022    1/24/2022
40      E              1/24/2022    1/31/2022
20      E              1/31/2022    2/7/2022
30      A              2/7/2022     2/14/2022
20      B              2/14/2022    2/21/2022

Expected output

I need to do groupby (country, start, and end) and the result column should add existing value with the above value and need to populate the average column. For example:

groupby country, start, and end with result and average column is nothing but 5, 5+10/2, 10+30/2, 30+50/2,50+60/2

result   average 
5         5            eg: (5)     
10        7.5           (5+10/2)  #resultcol of existingvalue + abovevalue divided by 2 = average
30        20            (10+30/2)
50        40            (30+50/2)
60        55            (50+60/2)
70        65            ...
40        55            ...
20        30            ...
30        25            ...
20        25            ...

Solution

  • Try this solution with grouping by country and date, however it may raise error if there is no sufficient data in a subset (i.e. larger than 2):

    df_data['average'] = df_data.groupby(['country', 'date'])['result'].rolling(2, min_periods=1).mean().reset_index(0, drop=True)
    

    In case you want to group by country only

    df_data['average'] = df_data.groupby(['country'])['result'].rolling(2, min_periods=1).mean().reset_index(0, drop=True)
    
    df_data
      country       date  result  average
    0       A  2/14/2022       5      5.0
    1       A  2/21/2022      10      7.5
    2       B  2/28/2022      30     30.0
    3       C   1/3/2022      50     50.0
    4       C  1/10/2022      60     55.0
    5       D  1/17/2022      70     70.0
    6       E  1/24/2022      40     40.0
    7       E  1/31/2022      20     30.0
    8       A   2/7/2022      30     20.0
    9       B  2/14/2022      20     25.0