
Different handling of NaN values in a DataFrame


I have a DataFrame like this:

      A    B 
 0    1    5
 1    1    7
 2  NaN  NaN
 3    1    8
 4  NaN  NaN
 5  NaN  NaN
 6    2    6
 7    2    2
 8  NaN  NaN
 9  NaN  NaN
10    2    3

Now I want to fill the NaN values inside an event differently from those between events. An event is a run of rows that share the same value in column A (in my example there are events 1 and 2). Inside an event, column A should simply contain the event number, and column B should carry forward the last value seen within that event. Between events, the NaN values should be set to 0.

I tried ffill() and fillna(), but can't make them match my conditions.

Expected outcome:

      A    B 
 0    1    5
 1    1    7
 2    1    7
 3    1    8
 4    0    0
 5    0    0
 6    2    6
 7    2    2
 8    2    2
 9    2    2
10    2    3

Thanks for the help :)


Solution

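  • You can use GroupBy.ffill with helper groups built by Series.mask: the forward-filled values of A are used as group keys only where they equal the back-filled values, so rows between events get no group and are not filled. Finally, replace the remaining missing values with 0 and convert to integers: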

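    # nearest event number above each row
    s = df.A.ffill()
    # group key: event number inside an event, NaN between events
    g = df.A.mask(s.eq(df.A.bfill()), s)
    # forward-fill per event, then zero the rows between events
    df = df.groupby(g).ffill().fillna(0).astype(int)
    print(df)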
        A  B
    0   1  5
    1   1  7
    2   1  7
    3   1  8
    4   0  0
    5   0  0
    6   2  6
    7   2  2
    8   2  2
    9   2  2
    10  2  3
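
For reference, here is a self-contained sketch of the same approach, with the sample frame rebuilt from the question and each helper series kept under its own name. The names `fwd`, `bwd`, `groups` and `result` are illustrative and not part of the answer above. It assumes the default `dropna=True` behaviour of groupby, where rows with a NaN group key come back as NaN from GroupBy.ffill.

```python
import numpy as np
import pandas as pd

# Sample data from the question: NaN rows appear both inside and between events.
df = pd.DataFrame({
    "A": [1, 1, np.nan, 1, np.nan, np.nan, 2, 2, np.nan, np.nan, 2],
    "B": [5, 7, np.nan, 8, np.nan, np.nan, 6, 2, np.nan, np.nan, 3],
})

fwd = df["A"].ffill()   # nearest event number above each row
bwd = df["A"].bfill()   # nearest event number below each row

# Inside an event both neighbours agree, so the row gets that event number;
# between events they differ and the original NaN is kept as the group key.
groups = df["A"].mask(fwd.eq(bwd), fwd)

# Forward-fill per event (rows with a NaN key stay NaN), then zero the gaps.
result = df.groupby(groups).ffill().fillna(0).astype(int)
print(result)
```

Note that a gap between two events carrying the same number would be treated as lying inside one event, because the comparison only looks at the nearest values above and below each row.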