i have a dataframe df with a column containing values (meter reading). Some values are sporadically missing (NA).
df excerpt:
row time meter_reading
1 03:10:00 26400
2 03:15:00 NA
3 03:20:00 27200
4 03:25:00 28000
5 03:30:00 NA
6 03:35:00 NA
7 03:40:00 30000
What I'm trying to do:
If there is only one consecutive NA, I want to interpolate (e.g. na.interpolation for row 2). But if there's two or more consecutive NA, I don't want R to interpolate and leave the values as NA. (e.g. row 5 and 6).
What I tried so far is loop (for...) with an if-condition. My approach:
library("imputeTS")
for(i in 1:(nrow(df))) {
if(!is.na(df$meter_reading[i]) & is.na(df$meter_reading[i-1]) & !is.na(df$meter_reading[i-2])) {
na_interpolation(df$meter_reading)
}
}
Giving me :
Error in if (!is.na(df$meter_reading[i]) & is.na(df$meter_reading[i - :
argument is of length zero
Any ideas how to do it? Am I completely wrong here?
Thanks!
I don't knaow what is your na.interpolation, but taking the mean of previous and next rows for example, you could do that with dplyr :
df %>% mutate(x=ifelse(is.na(meter_reading),
(lag(meter_reading)+lead(meter_reading))/2,
meter_reading))
# row time meter_reading x
#1 1 03:10:00 26400 26400
#2 2 03:15:00 NA 26800
#3 3 03:20:00 27200 27200
#4 4 03:25:00 28000 28000
#5 5 03:30:00 NA NA
#6 6 03:35:00 NA NA
#7 7 03:40:00 30000 30000