I am a little stuck here and could use some help. I am trying to interpolate missing data in a time series, but many of my cases (countries) have only a few observations, and these are often not consecutive. So I am trying to interpolate only between the first and the last observation for each country. How do I do that if a country still has NAs after its last observation that I do not want to be interpolated?
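library("tidyverse")
library("imputeTS")
# Toy panel: country 1 has a trailing NA (1993), country 2 has an interior NA (1991)
data <- data.frame(
  country = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  time    = c(1990, 1991, 1992, 1993, 1990, 1991, 1992, 1990, 1991, 1992),
  value   = c(5, 6, 7, NA, 5, NA, 7, 5, 6, 7)
)
print(data)
# My attempt: this also fills the trailing NA in country 1, which I do not want
data %>%
  group_by(country) %>%
  mutate(int = na_interpolation(value))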
I would like the value for 1993 in country 1 to remain NA. It's probably simple, but I can't wrap my head around it.
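Try the na.approx() function from the "zoo" package. With na.rm = FALSE it interpolates only between observed values and leaves leading and trailing NAs in place.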
library("zoo")
data %>% group_by(country) %>% mutate(int = na.approx(value, na.rm=FALSE))
Hope this is what you are looking for. This keeps the trailing NA in country 1 as NA.
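If you would rather not add a package, the same idea can be sketched in base R with approx(), which returns NA outside the observed range when rule = 1. This is only a rough sketch, not what zoo does internally: interp_inside is a made-up helper name, and I am assuming rows are sorted by time within each country.

```r
# Same toy data as in the question
data <- data.frame(
  country = c(1, 1, 1, 1, 2, 2, 2, 3, 3, 3),
  time    = c(1990, 1991, 1992, 1993, 1990, 1991, 1992, 1990, 1991, 1992),
  value   = c(5, 6, 7, NA, 5, NA, 7, 5, 6, 7)
)

# Hypothetical helper: linear interpolation of interior gaps only
interp_inside <- function(x, y) {
  ok <- !is.na(y)
  if (sum(ok) < 2) return(y)  # fewer than two observations: nothing to interpolate between
  # rule = 1 gives NA for xout values outside the range of observed x
  approx(x[ok], y[ok], xout = x, rule = 1)$y
}

# Apply the helper country by country
data$int <- NA_real_
for (g in unique(data$country)) {
  rows <- data$country == g
  data$int[rows] <- interp_inside(data$time[rows], data$value[rows])
}
print(data)
```

Because this interpolates against the time column rather than the row position, it also behaves sensibly if a country has unevenly spaced years.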