python pandas dataframe fillna

fillna on DataFrame with a simple function


I am looking for a way to fill the NaN values of a DataFrame with a simple rule: the value in the previous row plus 1. What makes this DataFrame particular is that it has several consecutive NaNs in a column.

Here is an example of the kind of DataFrame I am dealing with:

import pandas as pd
import numpy as np
df = pd.DataFrame({'a':[7, 3, 12, 0, np.nan, np.nan], 
                   'b':[0, 4, 8, np.nan, np.nan, np.nan], 
                   'c':[1, 2, 1, 4, 1, 1]})

Out[7]: 
      a    b  c
0   7.0  0.0  1
1   3.0  4.0  2
2  12.0  8.0  1
3   0.0  NaN  4
4   NaN  NaN  1
5   NaN  NaN  1

Here is the output I would like to obtain:

Out[7]: 
      a     b     c
0   7.0   0.0   1.0
1   3.0   4.0   2.0
2  12.0   8.0   1.0
3   0.0   9.0   4.0
4   1.0  10.0   1.0
5   2.0  11.0   1.0
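
To pin the rule down, here is a minimal loop-based sketch of it. It assumes that each missing value should equal the already-filled value of the row above plus 1, and that no column starts with a missing value (a leading NaN would simply stay NaN). The helper name `fill_increment_loop` is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd


def fill_increment_loop(s: pd.Series) -> pd.Series:
    """Hypothetical helper: replace each NaN with (value in the row above) + 1."""
    values = s.to_numpy(dtype=float, copy=True)
    for i in range(1, len(values)):
        if np.isnan(values[i]):
            # values[i - 1] has already been filled, so runs of NaNs keep counting up.
            values[i] = values[i - 1] + 1
    return pd.Series(values, index=s.index, name=s.name)


df = pd.DataFrame({'a': [7, 3, 12, 0, np.nan, np.nan],
                   'b': [0, 4, 8, np.nan, np.nan, np.nan],
                   'c': [1, 2, 1, 4, 1, 1]})

# Apply the rule column by column.
expected = df.apply(fill_increment_loop)
print(expected)
```

A loop like this is easy to read but slow on large frames, which is why a vectorized approach is preferable.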

Solution

  • You can try something like this:

    import pandas as pd
    import numpy as np
    
    df=pd.DataFrame({'a':[1, 2, np.nan, np.nan, 5, np.nan, 7]})
    df
    
         a
    0  1.0
    1  2.0
    2  NaN
    3  NaN
    4  5.0
    5  NaN
    6  7.0
    
    # Count rows since the last non-null value, then add that count to the forward-filled column
    df['a'] = df.groupby(df['a'].notnull().cumsum()).cumcount() + df['a'].ffill()
    df
    
         a
    0  1.0
    1  2.0
    2  3.0
    3  4.0
    4  5.0
    5  6.0
    6  7.0
    

    Update for your DataFrame, applying the same expression to each column:

    df = pd.DataFrame({'a':[7, 3, 12, 0, np.nan, np.nan], 
                       'b':[0, 4, 8, np.nan, np.nan, np.nan], 
                       'c':[1, 2, 1, 4, 1, 1]})
    
    df_out = df.apply(lambda x: x.groupby(x.notnull().cumsum()).cumcount() + x.ffill())
    

    Output:

          a     b  c
    0   7.0   0.0  1
    1   3.0   4.0  2
    2  12.0   8.0  1
    3   0.0   9.0  4
    4   1.0  10.0  1
    5   2.0  11.0  1
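
To see why the one-liner works, it can help to split it into its intermediate steps. The sketch below reuses the single-column example from this answer; the variable names (`block_id`, `offset`, `base`, `filled`) are only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2, np.nan, np.nan, 5, np.nan, 7], name='a')

# 1. Each non-null value starts a new block; the NaNs that follow share its label.
block_id = s.notnull().cumsum()

# 2. Position inside each block: 0 for the known value, then 1, 2, ... for the NaNs after it.
offset = s.groupby(block_id).cumcount()

# 3. Carry the last known value forward over the NaNs.
base = s.ffill()

# 4. Known values get +0; each NaN becomes the last known value plus its offset.
filled = base + offset

steps = pd.DataFrame({'a': s, 'block_id': block_id, 'offset': offset,
                      'base': base, 'filled': filled})
print(steps)
```

Because the offset restarts at 0 on every non-null value, existing values are left unchanged. If a column starts with NaN, `ffill()` has nothing to carry forward, so those leading rows stay NaN.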