How to extract the value at the "final visit" from a data set with repeated measurements in R?


Suppose I have a data frame with repeated measurements:

 >m
 id  age    diagnosis
  1   4         0
  1   7         1
  1   9         0
  2   6         1
  4   9         1
  4   10        0

Diagnosis is not time-invariant. How can I extract the diagnosis result at the final visit (oldest age) for each id, to get something like this?

id  age    diagnosis
 1   9         0
 2   6         1
 4   10        0

Solution

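  • You could try data.table's last(), which returns the last row of each id group. This assumes the rows are already sorted by age within each id, as they are in the example: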

    library(data.table)
    # `m` is the data frame from the question
    as.data.table(m)[, last(.SD), by = id]
    #    id age diagnosis
    # 1:  1   9         0
    # 2:  2   6         1
    # 3:  4  10         0
    

    Or with dplyr, using slice() with n() to keep the last row of each group:

    library(dplyr)
    # `m` is the data frame from the question
    slice(group_by(m, id), n())
    # Source: local data frame [3 x 3]
    # Groups: id [3]
    #
    #      id   age diagnosis
    #   (int) (int)     (int)
    # 1     1     9         0
    # 2     2     6         1
    # 3     4    10         0
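
Both approaches above take the last row per id, so they depend on the rows already being ordered by age within each id. If that is not guaranteed, one option is to sort first. The following is a minimal base R sketch of that idea, not part of the original answer; the variable names `shuffled`, `sorted`, and `final_visit` are illustrative, and the rows are deliberately shuffled to show that the input order does not matter.

```r
# Rebuild the example data from the question (called `m` there).
m <- data.frame(
  id        = c(1, 1, 1, 2, 4, 4),
  age       = c(4, 7, 9, 6, 9, 10),
  diagnosis = c(0, 1, 0, 1, 1, 0)
)

# Shuffle the rows so the data are no longer sorted by age within id.
shuffled <- m[c(3, 1, 6, 4, 2, 5), ]

# Sort by id, then by age, so the last row of each id is the oldest visit.
sorted <- shuffled[order(shuffled$id, shuffled$age), ]

# Keep only the last row for each id.
final_visit <- sorted[!duplicated(sorted$id, fromLast = TRUE), ]
rownames(final_visit) <- NULL

final_visit
```

If two visits for the same id share the same maximum age, this keeps whichever of them comes last after sorting, so a tie-breaking column may be needed in that case.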