I would like to find out the difference in days between the two date field based on the ID number. And the last date found based on ID would be 0 (Date - itself) Example a tibble dataframe:
ID | Date | Difference |
---|---|---|
1 | 1/1/2022 | 2 |
1 | 3/1/2022 | 0 |
2 | 2/1/2022 | 3 |
2 | 5/1/2022 | 23 |
2 | 28/1/2022 | 0 |
Is there any way to achieve this?
You could use diff()
to calculate lagged differences.
library(dplyr)
df %>%
group_by(ID) %>%
mutate(Date = as.Date(Date, '%d/%m/%Y'),
Difference = c(diff(Date), 0)) %>%
ungroup()
# # A tibble: 5 × 3
# ID Date Difference
# <int> <date> <drtn>
# 1 1 2022-01-01 2 days
# 2 1 2022-01-03 0 days
# 3 2 2022-01-02 3 days
# 4 2 2022-01-05 23 days
# 5 2 2022-01-28 0 days