How to fill missing values based on the current values using Python?


My data is like this:

import numpy as np
import pandas as pd

a = pd.DataFrame({'id': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
                  'value': [np.nan, np.nan, 0, np.nan, np.nan, 1, 2, np.nan, 3, np.nan]})

I want to fill each missing value with the most recent known value before it. If there is no previous value, fill with -1. The result should look like this:

id    value
0     -1
1     -1
2     0
3     0
4     0
5     1
6     2
7     2
8     3
9     3

My current approach is to find all the known values and their positions and then scan the whole table, but there must be a better way that I am not aware of. What can I try here?


Solution

  • Use DataFrame.ffill() to carry the last known value forward, then fillna(-1) for the leading rows that have no previous value:

    In [1587]: a.ffill().fillna(-1)
    Out[1587]: 
       id  value
    0   0   -1.0
    1   1   -1.0
    2   2    0.0
    3   3    0.0
    4   4    0.0
    5   5    1.0
    6   6    2.0
    7   7    2.0
    8   8    3.0
    9   9    3.0
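
The output above shows value as floats (-1.0, 0.0, ...) because a column that held NaN is stored as float. If integer output like the table in the question is wanted, a minimal sketch along the following lines may help. It assumes that only the value column should be filled and that every known value is a whole number; the name filled is just an illustrative variable.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({
    'id': [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
    'value': [np.nan, np.nan, 0, np.nan, np.nan, 1, 2, np.nan, 3, np.nan],
})

# Forward-fill only the 'value' column, then fill the leading gaps
# (rows with no previous known value) with -1.
filled = a['value'].ffill().fillna(-1)

# Assumption: all known values are whole numbers, so casting to int
# is lossless once no NaN remains.
a['value'] = filled.astype(int)

print(a)
```

Restricting the fill to a['value'] leaves the other columns untouched, which matters if they contain missing values that should not be forward-filled.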