
How to fill NA values with the mean of the previous and following years in R for panel data?


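How can I produce the imputed column shown below? Each NA in A should be replaced by the mean of the previous and following years' values for the same id. For example, id 1's 2001 value is filled with the mean of its 2000 and 2002 values, (6 + 8) / 2 = 7.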

id             Year     A      imputed
1              2000     6       6
1              2001     NA      7
1              2002     8       8
1              2003     10      10
2              2000     2       2
2              2001     NA      5
2              2002     8       8
2              2003     5       5
3              2000     9       9 
3              2001     10      10
3              2002     NA      10.5
3              2003     11      11

Solution

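  • library(dplyr)
    df %>%
      arrange(id, Year) %>%
      group_by(id) %>%   # keep lag()/lead() within each id
      mutate(Imputed = ifelse(is.na(A), (lag(A) + lead(A)) / 2, A)) %>%
      ungroup() %>%
      as.data.frame()    # plain data.frame for printing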
    

    Output is:

       id Year  A Imputed
    1   1 2000  6     6.0
    2   1 2001 NA     7.0
    3   1 2002  8     8.0
    4   1 2003 10    10.0
    5   2 2000  2     2.0
    6   2 2001 NA     5.0
    7   2 2002  8     8.0
    8   2 2003  5     5.0
    9   3 2000  9     9.0
    10  3 2001 10    10.0
    11  3 2002 NA    10.5
    12  3 2003 11    11.0
    

    # sample data (from dput(df))
    df <- structure(list(
      id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L),
      Year = c(2000L, 2001L, 2002L, 2003L, 2000L, 2001L, 2002L, 2003L,
               2000L, 2001L, 2002L, 2003L),
      A = c(6L, NA, 8L, 10L, 2L, NA, 8L, 5L, 9L, 10L, NA, 11L)),
      class = "data.frame", row.names = c(NA, -12L))
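
If you would rather not depend on dplyr, the same idea can be sketched in base R. This is only a minimal sketch, not part of the original answer: fill_neighbour_mean is a hypothetical helper name, and the data frame is rebuilt by hand from the question's table. ave() applies the helper separately within each id, so the previous and following values never come from a different id.

```r
# Sample panel rebuilt from the question's table (assumption: same values as df)
df <- data.frame(
  id   = rep(1:3, each = 4),
  Year = rep(2000:2003, times = 3),
  A    = c(6, NA, 8, 10, 2, NA, 8, 5, 9, 10, NA, 11)
)

# Hypothetical helper: replace each NA with the mean of its two neighbours
fill_neighbour_mean <- function(x) {
  prev_x <- c(NA, head(x, -1))  # value from the previous row (previous year)
  next_x <- c(tail(x, -1), NA)  # value from the next row (following year)
  ifelse(is.na(x), (prev_x + next_x) / 2, x)
}

# Sort so that rows within an id are in year order, then impute per id
df <- df[order(df$id, df$Year), ]
df$Imputed <- ave(df$A, df$id, FUN = fill_neighbour_mean)

print(df)
```

Note that an NA in the first or last year of an id, or two NAs in a row, stays NA here, because one of the two neighbours is missing and the mean cannot be computed.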