I have data in long format with several observations ("Visit") per individual (identified by "ID"). The number of observations per individual varies. I would like to create a new column with the last visit, which I have accomplished, and a column with the second last Visit.
My data looks like this:
ID <- c(1000,1000,1000,1001,1001,1001,1001,1002,1002,1002,1002,1002)
Visit <- c("BL","V02","V03","BL","V02","V03","V04","BL","V02","V03","V04","V05")
df <- data.frame(ID,Visit)
The lastVisit column is created by the following code:
df <- df %>%
group_by(ID) %>%
mutate(lastVisit = last(Visit))
The desired output is like this:
ID Visit lastVisit secondlastVisit
1000 BL V03 V02
1000 V02 V03 V02
1000 V03 V03 V02
1001 BL V04 V03
1001 V02 V04 V03
1001 V03 V04 V03
1001 V04 V04 V03
1002 BL V05 V04
1002 V02 V05 V04
1002 V03 V05 V04
1002 V04 V05 V04
1002 V05 V05 V04
I have tried using secondlastVisit = lag(Visit)
, but this does not give the desired output. A method using dplyr::mutate is preferred.
Thanks!
The function dplyr::nth()
does what you want; the negative index is to tell it to start from the end.
library("magrittr")
library("dplyr")
ID <- c(1000,1000,1000,1001,1001,1001,1001,1002,1002,1002,1002,1002)
Visit <- c("BL","V02","V03","BL","V02","V03","V04","BL","V02","V03","V04","V05")
df <- data.frame(ID,Visit)
df <- df %>%
group_by(ID) %>%
mutate(lastVisit = last(Visit)) %>%
mutate(secondlastVisit = nth(Visit, -2L))
df
#> # A tibble: 12 x 4
#> # Groups: ID [3]
#> ID Visit lastVisit secondlastVisit
#> <dbl> <chr> <chr> <chr>
#> 1 1000 BL V03 V02
#> 2 1000 V02 V03 V02
#> 3 1000 V03 V03 V02
#> 4 1001 BL V04 V03
#> 5 1001 V02 V04 V03
#> 6 1001 V03 V04 V03
#> 7 1001 V04 V04 V03
#> 8 1002 BL V05 V04
#> 9 1002 V02 V05 V04
#> 10 1002 V03 V05 V04
#> 11 1002 V04 V05 V04
#> 12 1002 V05 V05 V04
Created on 2023-04-13 with reprex v2.0.2