python pandas dataframe data-preprocessing

How do I fill NaN values in a column with the average of the previous and next non-NaN values in pandas?


Raw table:

Column A
5
nan
nan
15

New table:

Column A
5
10
10
15

Solution

  • One option is to fill the column twice, once forward (ffill) and once backward (bfill), and then average the two results row by row:

    import pandas as pd
    import numpy as np
    
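    df = pd.DataFrame({'x': [np.nan, 5, np.nan, np.nan, 15]})
    # fillna(method=...) is deprecated in recent pandas; ffill()/bfill() do the same job
    filled_series = [df['x'].ffill(), df['x'].bfill()]
    # put the two filled columns side by side and take the row-wise mean (NaN is skipped)
    print(pd.concat(filled_series, axis=1).mean(axis=1))
    # 0     5.0
    # 1     5.0
    # 2    10.0
    # 3    10.0
    # 4    15.0
    # dtype: float64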
    

    Because mean(axis=1) skips NaN, this also works when NaN occurs at the beginning or at the end: those rows simply take the value of their only available neighbor.
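
Applied to the table from the question, the same idea can be wrapped in a small helper. This is only a sketch: the function name fill_with_neighbor_mean is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd


def fill_with_neighbor_mean(s: pd.Series) -> pd.Series:
    """Hypothetical helper: fill each NaN with the mean of the nearest
    non-NaN value before it and the nearest non-NaN value after it."""
    forward = s.ffill()   # previous non-NaN value carried forward
    backward = s.bfill()  # next non-NaN value carried backward
    # mean(axis=1) skips NaN, so a gap at either edge falls back to its one neighbor
    return pd.concat([forward, backward], axis=1).mean(axis=1)


df = pd.DataFrame({"Column A": [5, np.nan, np.nan, 15]})
df["Column A"] = fill_with_neighbor_mean(df["Column A"])
print(df["Column A"].tolist())
# [5.0, 10.0, 10.0, 15.0]
```

Note that this differs from Series.interpolate(), whose default linear method would give evenly spaced values across the gap rather than 10 for both rows.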