Search code examples
pythonpandasdataframewindow-functionspandasql

how to create a sliding window and apply functions in pandas dataframes


I have the following dataframe :

Time      A

1         1
2         1
3         1
4         1
5         2
6         2
7         3
8         3
9         2
10        1
11        1
12        1
13        3
14        3
15        3

need to create a sliding window with length of 3 which slides 2 steps over time column and apply some costume function to column A ( for the sake of this examplelet's say mean and max)

the r equivalent for it would be

 dat %>% dplyr::mutate(SMA_A=rollapplyr(A, 3, mean ,by = 2,align ="center", partial=TRUE, fill=NA),
                  Max_A =rollapplyr(A, 3, max ,by=2, align ="center", partial=TRUE,fill=NA)
                 ) 

expected output :

    Time  A     SMA_A    Max_A
      1   1    1.000000     1
      2   1       NA        NA
      3   1 1.000000        1
      4   1       NA        NA
      5   2 1.666667        2
      6   2       NA        NA
      7   3 2.666667         3
      8   3       NA        NA
      9   2 2.000000        3
      10  1       NA        NA
      11  1 1.000000        1
      12  1       NA        NA
      13  3 2.333333        3
      14  3       NA        NA
      15  3 3.000000        3

or without NAs


Solution

  • You can do the following:

    1. Calculate rolling mean and max with DataFrame.rolling
    2. Assign them as new columns
    3. Get every 2nd index row by checking if the remainder == 1 which means it's an uneven number
    4. For every index in step3, assign NaN with .loc
    df['A'] = df.rolling(3, center=True)['A'].mean().bfill().ffill()
    df['Max_A'] = df.rolling(3, center=True)['A'].max().bfill().ffill()
    
    mask_idx = df.index%2 == 1
    
    df.loc[mask_idx, ['A', 'Max_A']] = np.NaN
    

    output

        Time         A  Max_A
    0      1  1.000000    1.0
    1      2       NaN    NaN
    2      3  1.000000    1.0
    3      4       NaN    NaN
    4      5  1.666667    2.0
    5      6       NaN    NaN
    6      7  2.666667    3.0
    7      8       NaN    NaN
    8      9  2.000000    3.0
    9     10       NaN    NaN
    10    11  1.000000    1.0
    11    12       NaN    NaN
    12    13  2.333333    3.0
    13    14       NaN    NaN
    14    15  3.000000    3.0