
Add missing value in column according to value of previous or next row in R


I have longitudinal data and need to impute missing values according to the following rules:

  1. If a person's first follow-up value is missing, fill it with the value of the next row;

  2. If a person's later (non-first) follow-up value is missing, fill it with the value of the previous row;

  3. If several consecutive follow-up values are missing, fill them with the value of the last non-missing row before them.

Here is an example, where dat is the input and dat_imputed is the desired output:

dat <- data.frame(id = c(1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3), b6 = c(NA,1,1,1,1,1,1,1,1,1,NA,3,NA,NA,5,5,5,5,3,NA,NA))
dat_imputed <- data.frame(id = c(1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3), b6 = c(1,1,1,1,1,1,1,1,1,1,1,3,3,5,5,5,5,5,3,3,3))

Thanks for any suggestions!


Solution

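  • Group by id, fill values downwards, then fill upwards. The downward pass carries the last observed value forward within each id, which covers rules 2 and 3. After that, the only NAs left are at the start of a person's records, and the upward pass fills those from the next observed value (rule 1). I think this is what you need.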

    library(dplyr)
    library(tidyr)
    
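    res <- dat %>%
      group_by(id) %>%                     # impute within each person only
      fill(b6, .direction = "down") %>%    # rules 2 and 3: carry the last observed value forward
      fill(b6, .direction = "up") %>%      # rule 1: leading NAs take the next observed value
      ungroup()                            # drop the grouping before further processing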
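
As a possible shortcut, tidyr's fill() also accepts .direction = "downup", which does the downward pass followed by the upward pass in a single call. The sketch below assumes a tidyr version that supports that option (1.0.0 or later, as far as I know) and reuses dat and dat_imputed from the question to check that the result matches the expected output.

```r
library(dplyr)
library(tidyr)

# Example data copied from the question
dat <- data.frame(
  id = c(1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3),
  b6 = c(NA,1,1,1,1,1,1,1,1,1,NA,3,NA,NA,5,5,5,5,3,NA,NA)
)
dat_imputed <- data.frame(
  id = c(1,1,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3),
  b6 = c(1,1,1,1,1,1,1,1,1,1,1,3,3,5,5,5,5,5,3,3,3)
)

res <- dat %>%
  group_by(id) %>%
  fill(b6, .direction = "downup") %>%  # fill down first, then up, within each id
  ungroup()

# Compare the imputed column with the expected output from the question
stopifnot(identical(res$b6, dat_imputed$b6))
```

Note that if every b6 value for an id is missing, there is nothing to carry in either direction, so those rows stay NA.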