r row interpolation na missing-data

Substitute NA values depending on their position in a data frame


I would like to replace each NA value with the average of the values in the previous and next rows. When the NA is in the first or last row, I would like to simply repeat the value from the next or previous row, respectively. My real data contain negative and decimal values.

My input:

1.0   NA    1.0
NA    2.0   2.0
3.0   3.0   NA

My expected output:

1.0   2.0   1.0
2.0   2.0   2.0
3.0   3.0   2.0

Cheers!


Solution

  • You could also use the na.approx function from the zoo package. Note that its behavior differs slightly from the solution by @flodel when there are two consecutive NA values. For the first and last rows you can then use na.locf.

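    library(zoo)
    y <- na.approx(x)  # linear interpolation of interior NAs, column by column
    # last row: carry the previous row's values forward into remaining NAs
    y[nrow(y), ] <- na.locf(y[(nrow(y) - 1):nrow(y), ])[2, ]
    # first row: carry the second row's values backward into remaining NAs
    y[1, ] <- na.locf(y[1:2, ], fromLast = TRUE)[1, ]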
    

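    EDIT: @Grothendieck pointed out that this was much too complicated. na.approx passes rule=2 on to approx, which fills leading and trailing NAs with the nearest observed value, so the entire code above reduces to one line: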

    na.approx(x, rule=2)
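
To check the one-liner against the question's data, here is a minimal sketch. It assumes x is stored as a numeric matrix, since the question does not show how x was created, and that the zoo package is installed.

```r
library(zoo)

# The question's input as a numeric matrix (an assumption about how x is stored).
x <- matrix(c( 1, NA,  1,
              NA,  2,  2,
               3,  3, NA),
            nrow = 3, byrow = TRUE)

# Interior NAs are linearly interpolated within each column; rule = 2 is
# passed on to approx() and fills NAs in the first or last row with the
# nearest observed value in that column.
y <- na.approx(x, rule = 2)
print(y)
```

On this input the result should match the expected output in the question: the NA in the middle of column 1 becomes 2 (the average of 1 and 3), and the NAs in the first and last rows copy the adjacent row's value.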