Search code examples
pythonpandasrolling-average

Exclude center value from centered rolling average in Pandas


I want to calculate the centered rolling average in Pandas, where the center value is excluded.

The code below works perfectly, but it includes the center value:

df.Value.rolling(window=11, center=True, min_periods=1).mean()

The window=11 and min_periods=1 are important, since I want to calculate the average for every value in df.

For example,

  • for the second value, it should consider 5 values below and 1 that is above the current value. (Current code includes the second value)
  • for the last value, it should consider 5 values above, but exclude the last one. (Current code considers the last 6 values)

Here is a table with the expected results:

Values Expected
13313 27180.6
12792 28897
20254 28770.14286
34915 27468.5
31410 29037
36532 30028.5
36958 34071.1
35471 36600.66667
33018 38950.625
38080 38804
44074 39037.33333
54165 37520.2

And calculation method from Excel:

Example Table


Solution

  • updated answer

    You can modify my first approach of the initial answer to have a dynamic division:

    win = 11
    roll = df['Values'].rolling(window=win, center=True, min_periods=1)
    
    df['out'] = roll.sum().sub(df['Values']).div(roll.count()-1)
    

    Output:

        Values     Expected           out
    0    13313  27180.60000  27180.600000
    1    12792  28897.00000  28897.000000
    2    20254  28770.14286  28770.142857
    3    34915  27468.50000  27468.500000
    4    31410  29037.00000  29037.000000
    5    36532  30028.50000  30028.500000
    6    36958  34071.10000  34071.100000
    7    35471  36600.66667  36600.666667
    8    33018  38950.62500  38950.625000
    9    38080  38804.00000  38804.000000
    10   44074  39037.33333  39037.333333
    11   54165  37520.20000  37520.200000
    

    initial answer

    You have different ways, whether or not the operation is really a mean:

    np.random.seed(0)
    df = pd.DataFrame({'Value': np.random.randint(0, 10, 10)})
    
    win = 3
    
    df['mean'] = df['Value'].rolling(window=win, center=True, min_periods=1).mean()
    
    df['mean_without_center'] = (df['Value'].rolling(window=win, center=True, min_periods=1)
                                 .sum().sub(df['Value']).div(win-1)
                                )
    
    half = win//2
    mask = [False]*half+[True]+[False]*half
    
    df['mean_without_center_apply'] = (df['Value'].rolling(window=win, center=True, min_periods=1)
                                 .apply(lambda s: s.mask(mask[:len(s)]).mean())
                                )
    

    Output:

       Value      mean  mean_without_center  mean_without_center_apply
    0      5  2.500000                  0.0                        5.0
    1      0  2.666667                  4.0                        4.0
    2      3  2.000000                  1.5                        1.5
    3      3  4.333333                  5.0                        5.0
    4      7  6.333333                  6.0                        6.0
    5      9  6.333333                  5.0                        5.0
    6      3  5.666667                  7.0                        7.0
    7      5  3.333333                  2.5                        2.5
    8      2  3.666667                  4.5                        4.5
    9      4  3.000000                  1.0                        2.0
    

    Or with 's sliding_window_view:

    from numpy.lib.stride_tricks import sliding_window_view as swv
    
    win = 3
    half = win//2
    
    df['swv_mean'] = np.nanmean(swv(np.pad(df['Value'].astype(float), (win-1, 0),
                                           constant_values=np.nan),
                                    win)
                                [:, np.r_[:half, half+1:win]],
                                axis=1
                               )
    

    Output:

       Value  swv_mean
    0      5       5.0
    1      0       0.0
    2      3       4.0
    3      3       1.5
    4      7       5.0
    5      9       6.0
    6      3       5.0
    7      5       7.0
    8      2       2.5
    9      4       4.5