Calculate the rolling mean of every n-th element over an m-element window in Python


Suppose I have a vector like so:

s = pd.Series(range(50))

The rolling mean over, say, a 2-element window is easily calculated:

s.rolling(window=2, min_periods=2).mean()
0    NaN
1    0.5
2    1.5
3    2.5
4    3.5
5    4.5
6    5.5
7    6.5
8    7.5
9    8.5
...

Now, instead of averaging 2 adjacent elements, I want the window to pick every third element while still using only the last 2 of them. The result would be this vector:

0    NaN 
1    NaN
2    NaN
3    1.5 -- (3+0)/2
4    2.5 -- (4+1)/2
5    3.5 -- (5+2)/2
6    4.5 -- ...
7    5.5
8    6.5
9    7.5
...

How can I achieve this efficiently?

Thanks!


Solution

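  • Use numpy.lib.stride_tricks.as_strided. Its strides argument specifies the number of bytes to step in each dimension when traversing an array, so you can build a view in which each row pairs an element with the one 3 positions after it, then average each row.

    import numpy as np
    arr = np.arange(10)
    step = arr.strides[0]  # bytes between consecutive elements
    # Row i is [arr[i], arr[i + 3]]; len(arr) - 3 rows keep every read in bounds.
    strided = np.lib.stride_tricks.as_strided(arr, shape=(len(arr) - 3, 2), strides=(step, 3 * step))
    result = strided.mean(axis=1)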
    

    output:

    array([1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5])
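The NumPy result above drops the leading NaN rows and the original index. If you want a Series aligned with s, as in the question, a shift-based version may be simpler. This is only a sketch: strided_rolling_mean and its step and window parameters are names made up for illustration, not part of pandas, and it assumes a default positional index with no gaps.

```python
import pandas as pd


def strided_rolling_mean(s: pd.Series, step: int, window: int) -> pd.Series:
    """Mean of the current element and the (window - 1) elements before it,
    taken `step` positions apart. Hypothetical helper, not a pandas API."""
    # shift(k * step) lines up the value from k * step rows earlier with each row.
    # Rows without enough history come out as NaN, because NaN propagates
    # through the addition.
    total = sum(s.shift(k * step) for k in range(window))
    return total / window


s = pd.Series(range(50))
result = strided_rolling_mean(s, step=3, window=2)
print(result.head(10))
```

With step=3 and window=2 this reduces to (s + s.shift(3)) / 2, so the first three rows are NaN and row 3 is (3+0)/2 = 1.5. It does window vectorized shifts and never touches raw strides, so there is no risk of reading out of bounds.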