Search code examples
rdplyrtidyverselag

Create lagged variables for consecutive time points only using R


I have an unbalanced panel (with unequally spaced measurement points) and would like to create a lagged variable of x by group (Variable: id) but only for consecutive time points. My data looks like this:

# simple example with an unbalanced panel
base <- data.frame(id = rep(1:2, each = 7),
                  time = c(1, 2, 3, 4, 7, 8, 10, 3, 4, 6, 9, 10, 11, 14), 
                  x = rnorm(14, mean = 3, sd = 1))

I already tried this code using dplyr:

base_lag <- base %>% # Add lagged column
  group_by(id) %>%
  dplyr::mutate(lag1_x = dplyr::lag(x, n = 1, default = NA)) %>% 
  as.data.frame()
base_lag # Print updated data

However, this way I get a lagged variable regardless of the fact that in some cases it is not two consecutive time points.

My final data set should look like this:

   id time        x   lag1_x
1   1    1 3.437416       NA
2   1    2 2.300553 3.437416
3   1    3 2.374212 2.300553
4   1    4 4.374009 2.374212
5   1    7 1.177433       NA
6   1    8 1.543353 1.177433
7   1   10 3.222358       NA
8   2    3 3.763765       NA
9   2    4 3.881182 3.763765
10  2    6 4.754420       NA
11  2    9 4.518227       NA
12  2   10 2.512486 4.518227
13  2   11 3.129230 2.512486
14  2   14 2.152509       NA

Does anyone here have a tip for me on how to create this lagged variable? Many thanks in advance!


Solution

  • You could use ifelse, testing whether diff(time) is equal to 1. If so, write the lag. If not, write an NA.

    base %>%
      group_by(id) %>%
      mutate(lag1_x = ifelse(c(0, diff(time)) == 1, lag(x, n = 1, default = NA), NA)) %>% 
      as.data.frame()
    #>    id time        x   lag1_x
    #> 1   1    1 1.852343       NA
    #> 2   1    2 2.710538 1.852343
    #> 3   1    3 2.700785 2.710538
    #> 4   1    4 2.588489 2.700785
    #> 5   1    7 3.252223       NA
    #> 6   1    8 2.108079 3.252223
    #> 7   1   10 3.435683       NA
    #> 8   2    3 1.762462       NA
    #> 9   2    4 2.775732 1.762462
    #> 10  2    6 3.377396       NA
    #> 11  2    9 3.133336       NA
    #> 12  2   10 3.804190 3.133336
    #> 13  2   11 2.942893 3.804190
    #> 14  2   14 3.503608       NA