python pandas dataframe row ffill

Dataframe fill column with previous value until condition


I have a dataframe that looks like this:

Step Text Parameter
15 print 1
16 control 2
17 printout 3
18 print2 1
19 Nan 2
20 Nan 3
21 Nan 4
22 Nan 1
23 Nan 2
24 Nan 1

And I want my dataframe to look like this:

Step Text Parameter
15 print 1
15 print 2
15 print 3
16 control 1
16 control 2
16 control 3
16 control 4
17 printout 1
17 printout 2
18 print2 1

So basically, every time there is a "1" in the Parameter column, I need to move on to the next value from Step and Text and keep repeating it until the next "1". Any ideas? :)


Solution

  • You can use Index.repeat on a custom group:

    import pandas as pd

    # make sure the 'Nan' strings are real missing values
    df['Text'] = df['Text'].replace('Nan', pd.NA)
    
    # get the number of rows per group starting with 1
    n = df.groupby(df['Parameter'].eq(1).cumsum()).size()
    
    # repeat the index of each non-NaN row once per row in its group
    idx = df['Text'].dropna().index.repeat(n)
    
    # replace the values ignoring the index
    # (using the underlying numpy array)
    df[['Step', 'Text']] = df.loc[idx, ['Step', 'Text']].to_numpy()
    

    Output:

      Step      Text  Parameter
    0   15     print          1
    1   15     print          2
    2   15     print          3
    3   16   control          1
    4   16   control          2
    5   16   control          3
    6   16   control          4
    7   17  printout          1
    8   17  printout          2
    9   18    print2          1
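
For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same repeat-per-group idea. The input frame is rebuilt from the question (with 'Nan' as a literal string, as posted). The helper name fill_by_parameter_reset and the length check are my own additions for illustration, not part of the answer above. It also assigns the two columns one at a time instead of through a single 2D array, which is just one way to keep Step as an integer column.

```python
import pandas as pd

# Input rebuilt from the question; "Nan" is a plain string here, as in the post.
df = pd.DataFrame({
    "Step": list(range(15, 25)),
    "Text": ["print", "control", "printout", "print2"] + ["Nan"] * 6,
    "Parameter": [1, 2, 3, 1, 2, 3, 4, 1, 2, 1],
})


def fill_by_parameter_reset(frame: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: spread each (Step, Text) pair over one run of Parameter."""
    out = frame.copy()
    out["Text"] = out["Text"].replace("Nan", pd.NA)

    # A new run starts at every Parameter == 1; size() counts the rows in each run.
    run_sizes = out.groupby(out["Parameter"].eq(1).cumsum()).size()

    # Source rows are the ones that still carry a real Text value.
    source = out["Text"].dropna().index
    if len(source) != len(run_sizes):
        # Assumption of this sketch: exactly one labelled row per run.
        raise ValueError("need exactly one labelled row per run of Parameter")

    # Repeat each source label once per row of its run, then copy values by position.
    idx = source.repeat(run_sizes.to_numpy())
    out["Step"] = out.loc[idx, "Step"].to_numpy()
    out["Text"] = out.loc[idx, "Text"].to_numpy()
    return out


result = fill_by_parameter_reset(df)
print(result)
```

Note that this only works when the number of labelled rows equals the number of runs (four of each in the question's data), which is why the sketch raises an error otherwise rather than producing misaligned rows.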