Search code examples
rdplyrmutate

How to create a new colum that identifies the last and second last row in longitudinal data using dplyr


I have data in long format with several observations ("Visit") per individual (identified by "ID"). The number of observations per individual varies. I would like to create a new column with the last visit, which I have accomplished, and a column with the second last Visit.

My data looks like this:

ID <- c(1000,1000,1000,1001,1001,1001,1001,1002,1002,1002,1002,1002)
Visit <- c("BL","V02","V03","BL","V02","V03","V04","BL","V02","V03","V04","V05")
df <- data.frame(ID,Visit)

The lastVisit column is created by the following code:

df <- df %>% 
  group_by(ID) %>% 
  mutate(lastVisit = last(Visit))

The desired output is like this:

ID Visit lastVisit secondlastVisit       
1000 BL    V03       V02             
1000 V02   V03       V02             
1000 V03   V03       V02             
1001 BL    V04       V03             
1001 V02   V04       V03            
1001 V03   V04       V03             
1001 V04   V04       V03             
1002 BL    V05       V04             
1002 V02   V05       V04             
1002 V03   V05       V04             
1002 V04   V05       V04             
1002 V05   V05       V04 

I have tried using secondlastVisit = lag(Visit), but this does not give the desired output. A method using dplyr::mutate is preferred.

Thanks!


Solution

  • The function dplyr::nth() does what you want; the negative index is to tell it to start from the end.

    library("magrittr")
    library("dplyr")
    
    ID <- c(1000,1000,1000,1001,1001,1001,1001,1002,1002,1002,1002,1002)
    Visit <- c("BL","V02","V03","BL","V02","V03","V04","BL","V02","V03","V04","V05")
    df <- data.frame(ID,Visit)
    
    df <- df %>% 
      group_by(ID) %>% 
      mutate(lastVisit = last(Visit)) %>%
      mutate(secondlastVisit = nth(Visit, -2L))
    
    df
    #> # A tibble: 12 x 4
    #> # Groups:   ID [3]
    #>       ID Visit lastVisit secondlastVisit
    #>    <dbl> <chr> <chr>     <chr>          
    #>  1  1000 BL    V03       V02            
    #>  2  1000 V02   V03       V02            
    #>  3  1000 V03   V03       V02            
    #>  4  1001 BL    V04       V03            
    #>  5  1001 V02   V04       V03            
    #>  6  1001 V03   V04       V03            
    #>  7  1001 V04   V04       V03            
    #>  8  1002 BL    V05       V04            
    #>  9  1002 V02   V05       V04            
    #> 10  1002 V03   V05       V04            
    #> 11  1002 V04   V05       V04            
    #> 12  1002 V05   V05       V04
    

    Created on 2023-04-13 with reprex v2.0.2