python pandas feature-engineering data-preprocessing

Averaging five rows above the value in the target column

The challenge that I have, and don't know how to approach is to have averaged five, ten, or whatever amount of rows above the target value plus the target row.

Dataset

target |  A  |  B  |
----------------------
  nan  |  6  |  4  |
  nan  |  2  |  7  |  
  nan  |  4  |  9  |
  nan  |  7  |  3  |
  nan  |  3  |  7  |
  nan  |  6  |  8  |
  nan  |  7  |  6  |
   53  |  4  |  5  |
  nan  |  6  |  4  |
  nan  |  2  |  7  |
  nan  |  3  |  3  |
  nan  |  4  |  9  |
  nan  |  7  |  3  |
  nan  |  3  |  7  |
   51  |  1  |  3  |

Desired format:

target |  A  |  B  |
----------------------
   53  | 5.16|6.33 |
   51  |3.33 |5.33 |

Solution

Try this, [::-1] reversing element to order the dataframe bottom to top, so we can group the values "above" valid targets:

df.groupby(df['target'].notna()[::-1].cumsum()[::-1]).apply(lambda x: x.tail(6).mean())

Output:

        target         A         B
target                            
1         51.0  3.333333  5.333333
2         53.0  5.166667  6.333333