Search code examples
rtibble

Find the difference in Data based on ID in R


I would like to find out the difference in days between the two date field based on the ID number. And the last date found based on ID would be 0 (Date - itself) Example a tibble dataframe:

ID Date Difference
1 1/1/2022 2
1 3/1/2022 0
2 2/1/2022 3
2 5/1/2022 23
2 28/1/2022 0

Is there any way to achieve this?


Solution

  • You could use diff() to calculate lagged differences.

    library(dplyr)
    
    df %>%
      group_by(ID) %>%
      mutate(Date = as.Date(Date, '%d/%m/%Y'),
             Difference = c(diff(Date), 0)) %>%
      ungroup()
    
    # # A tibble: 5 × 3
    #      ID Date       Difference
    #   <int> <date>     <drtn>
    # 1     1 2022-01-01  2 days
    # 2     1 2022-01-03  0 days
    # 3     2 2022-01-02  3 days
    # 4     2 2022-01-05 23 days
    # 5     2 2022-01-28  0 days