Search code examples
pythonpandasdataframefillna

Fillna with backwards and forward looking condition in Pandas


I am working with a dataframe that has a column with several NaN that I want to fill according to the following condition: If going backwards and forward up to 3 rows there are 2 equal values, then fill the NaN with that value.

Since this might not be clear, a couple of examples below:

  col1                    
0 10    
1 10  
2 NaN
3 NaN
4 NaN
5 10
6 5
7 NaN
8 5
9 NaN
10 NaN
11 NaN
12 NaN
  • The value in row 2 has a 10 at 1 row going back and a 10 in 3 rows going forward. --> Fill with 10
  • The value in row 3 has a 10 at 2 rows going back and a 10 in 2 rows going forward. --> Fill with 10
  • The value in row 4 has a 10 at 3 rows going back and a 10 in 1 row going forward. --> Fill with 10
  • The value in row 7 has a 5 at 1 row going back and a 5 in 1 row going forward. --> Fill with 5
  • The value in row 9 has a 5 at 1 row going back but no 5 in the 3 rows going forward. --> Then, don't fill

Then, the result would be like this:

  col1                    
0 10    
1 10  
2 10
3 10
4 10
5 10
6 5
7 5
8 5
9 NaN
10 NaN
11 NaN
12 NaN

Is there any functionality I can use to give this logic to the fillna?

Thanks!!


Solution

  • You can compare forward filling and back filling Series with limit parameter, chain mask with & for bitwise AND for only rows with missing values and replace it by forward filling column:

    m1 = df['col1'].isna()
    f = df['col1'].ffill(limit=3)
    m2 = f.eq(df['col1'].bfill(limit=3))
    
    df['col2'] = df['col1'].mask(m1 & m2, f)
    print (df)
        col1  col2
    0   10.0  10.0
    1   10.0  10.0
    2    NaN  10.0
    3    NaN  10.0
    4    NaN  10.0
    5   10.0  10.0
    6    5.0   5.0
    7    NaN   5.0
    8    5.0   5.0
    9    NaN   NaN
    10   NaN   NaN
    11   NaN   NaN
    12   NaN   NaN