Search code examples
rdplyrdata-manipulationdata-cleaning

Create lagged variables for several columns group by two conditions in r


I would like to create lagged variables for several columns that are grouped by two conditions.

Here is the dataset:

df <- data.frame(id = c(rep(1,4),rep(2,4)), tp = rep(1:4,2), x1 = 1:8, x2 = 2:9, x3 = 3:10, x4 = 4:11)
> df
  id tp x1 x2 x3 x4
1  1  1  1  2  3  4
2  1  2  2  3  4  5
3  1  3  3  4  5  6
4  1  4  4  5  6  7
5  2  1  5  6  7  8
6  2  2  6  7  8  9
7  2  3  7  8  9 10
8  2  4  8  9 10 11

I want to lag x1, x2, x3, x4 that are grouped by id and tp and create new variables x1_lag1, x2_lag1, x3_lag1, x4_lag1, like this:

> df
  id tp x1 x2 x3 x4  x1_lag1 x2_lag1 x3_lag1 x4_lag1
1  1  1  1  2  3  4     2       3       4       5
2  1  2  2  3  4  5     3       4       5       6
3  1  3  3  4  5  6     4       5       6       7
4  1  4  4  5  6  7     NA      NA      NA      NA
5  2  1  5  6  7  8     6       7       8       9
6  2  2  6  7  8  9     7       8       9       10
7  2  3  7  8  9 10     8       9       10      11
8  2  4  8  9 10 11     NA      NA      NA      NA

How to achieve that?


Solution

  • Your result doesn't seem to be grouped by tp at all. It is grouped by id and ordered by tp within the id grouping.

    Generally a "lag" is a variable that takes the value from the previous row. The columns you want labeled as "lag" columns take the value from the next row, so we use the lead function.

    library(dplyr)
    df %>%
      group_by(id) %>%
      mutate(across(starts_with("x"), lead, .names = "{.col}_lag1")) %>%
      ungroup()
    # A tibble: 8 × 10
         id    tp    x1    x2    x3    x4 x1_lag1 x2_lag1 x3_lag1 x4_lag1
      <dbl> <int> <int> <int> <int> <int>   <int>   <int>   <int>   <int>
    1     1     1     1     2     3     4       2       3       4       5
    2     1     2     2     3     4     5       3       4       5       6
    3     1     3     3     4     5     6       4       5       6       7
    4     1     4     4     5     6     7      NA      NA      NA      NA
    5     2     1     5     6     7     8       6       7       8       9
    6     2     2     6     7     8     9       7       8       9      10
    7     2     3     7     8     9    10       8       9      10      11
    8     2     4     8     9    10    11      NA      NA      NA      NA