python scikit-learn imputation

How can I use an imputing class to replace a value with the one on the row above?


I have the following dataframe:

|1|2|3|
|4|-999|6|
|7|8|9|

I want to replace only the -999 value with the value from the previous row in the same column. In this case that value is 2, and the dataframe would look like this:

|1|2|3|
|4|2|6|
|7|8|9|

I can do this with a loop, but I think one of the imputing classes would be better. My first choice would be KNNImputer, but although its documentation says it uses values from the k nearest neighbors, I don't know whether it can copy the value directly from a specific neighbor.

So, how can I solve this problem? Is an imputer class a good idea, or is there a better approach?


Solution

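  • I would suggest replacing the value (-999) with np.nan and then using fillna() with method='ffill'. It propagates the last valid value forward into the NA values. In recent pandas versions the method argument of fillna() is deprecated, and DataFrame.ffill() does the same thing.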

    Note that if the first element of a column is np.nan, it is not filled, since there is no value before it to propagate.
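
    The suggestion above could look roughly like the following. This is a minimal sketch, assuming the data is a pandas DataFrame and that -999 only ever appears as a missing-value marker; the variable names are made up for illustration. It uses DataFrame.ffill(), which is equivalent to fillna(method='ffill').

```python
import numpy as np
import pandas as pd

# Example data from the question; -999 marks the missing value.
df = pd.DataFrame([[1, 2, 3], [4, -999, 6], [7, 8, 9]])

# Turn the sentinel into NaN, then forward-fill down each column.
# The affected column becomes float because NaN is a float value.
filled = df.replace(-999, np.nan).ffill()
print(filled)

# Hypothetical edge case: a sentinel in the first row has nothing above it,
# so it stays NaN after the forward fill.
first_row_missing = pd.DataFrame([[-999, 2], [4, 5]])
still_missing = first_row_missing.replace(-999, np.nan).ffill()
print(still_missing)
```

    If the leading-NaN case can occur in your data, you would need to decide separately how to handle it (for example with a default value), since forward filling alone leaves it missing.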