Tags: python, pandas, ffill

Is there a way to forward fill with ascending logic in pandas / numpy?


What is the most pandastic way to forward fill with ascending logic (without iterating over the rows)?

input:

import pandas as pd
import numpy as np

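# Build the example frame: 'test' is the input, 'desired_output' is the target.
df = pd.DataFrame({
    'test': [np.nan, np.nan, 1, np.nan, np.nan, 3, np.nan, np.nan, 2, np.nan, 6, np.nan, np.nan],
    'desired_output': [np.nan, np.nan, 1, 1, 1, 3, 3, 3, 3, 3, 6, 6, 6],
})

print(df)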

output:

    test  desired_output
0    NaN             NaN
1    NaN             NaN
2    1.0             1.0
3    NaN             1.0
4    NaN             1.0
5    3.0             3.0
6    NaN             3.0
7    NaN             3.0
8    2.0             3.0
9    NaN             3.0
10   6.0             6.0
11   NaN             6.0
12   NaN             6.0

In the 'test' column, the number of consecutive NaNs varies.

In the 'desired_output' column, I am trying to forward fill with ascending values only: when a lower value is encountered (row 8, value = 2.0 above), it is overwritten with the current higher value.

Can anyone help? Thanks in advance.


Solution

  • You can combine cummax, which computes the cumulative maximum (it skips NaNs by default, so those positions stay NaN), with ffill, which then fills the remaining NaNs:

    df['desired_output'] = df['test'].cummax().ffill()
    

    output:

        test  desired_output
    0    NaN             NaN
    1    NaN             NaN
    2    1.0             1.0
    3    NaN             1.0
    4    NaN             1.0
    5    3.0             3.0
    6    NaN             3.0
    7    NaN             3.0
    8    2.0             3.0
    9    NaN             3.0
    10   6.0             6.0
    11   NaN             6.0
    12   NaN             6.0
    

    intermediate Series (cummax alone, before ffill):

    df['test'].cummax()
    
    0     NaN
    1     NaN
    2     1.0
    3     NaN
    4     NaN
    5     3.0
    6     NaN
    7     NaN
    8     3.0
    9     NaN
    10    6.0
    11    NaN
    12    NaN
    Name: test, dtype: float64
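
Since the question also mentions numpy, here is a minimal self-contained sketch that checks the cummax + ffill approach against the expected column and shows one possible NumPy-only variant. The NumPy variant is not part of the original answer; it assumes that accumulating np.fmax (which returns the non-NaN operand when only one side is NaN) gives the same result. The names via_pandas, via_numpy and expected are illustrative only.

```python
import numpy as np
import pandas as pd

test = pd.Series(
    [np.nan, np.nan, 1, np.nan, np.nan, 3, np.nan, np.nan, 2, np.nan, 6, np.nan, np.nan],
    name="test",
)
expected = pd.Series(
    [np.nan, np.nan, 1, 1, 1, 3, 3, 3, 3, 3, 6, 6, 6],
    name="test",
)

# pandas: running maximum (NaN positions stay NaN), then forward fill.
via_pandas = test.cummax().ffill()

# NumPy sketch: np.fmax ignores a NaN when the other operand is a number,
# so accumulating it yields the forward-filled running maximum in one pass.
via_numpy = pd.Series(
    np.fmax.accumulate(test.to_numpy()),
    index=test.index,
    name="test",
)

# Both approaches should match the desired output, including the leading NaNs.
pd.testing.assert_series_equal(via_pandas, expected)
pd.testing.assert_series_equal(via_numpy, expected)
```

Both variants avoid iterating over the rows in Python. The pandas version is probably the more readable choice inside a DataFrame workflow, while the np.fmax.accumulate version may be handy when the data is already a plain array.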