I have the following data:
Account | date | type |
---|---|---|
1 | 2021-08-31 | 0 |
1 | 2021-09-23 | 0 |
1 | 2021-09-30 | 5 |
1 | 2021-10-30 | 0 |
1 | 2021-12-29 | 0 |
1 | 2022-01-31 | 8 |
1 | 2022-02-02 | 0 |
I need to find the minimum date of each individual transition.
group_by(Account, type) %>%
summarise(first_appearance = min(date))
returns
Account | date | type |
---|---|---|
1 | 2021-08-31 | 0 |
1 | 2021-09-30 | 5 |
1 | 2022-01-31 | 8 |
How can I group by each SHIFT in type?
My initial thoughts are to generate some sort of sequence along the factors and concatenate to have a unique grouping variable, but how would this be done?
Account | date | type | order | type_order |
---|---|---|---|---|
1 | 2021-08-31 | 0 | A | 0A |
1 | 2021-09-23 | 0 | A | 0A |
1 | 2021-09-30 | 5 | A | 5A |
1 | 2021-10-30 | 0 | B | 0B |
1 | 2021-12-29 | 0 | B | 0B |
1 | 2022-01-31 | 8 | A | 8A |
1 | 2022-02-02 | 0 | C | 0C |
Desired output would be:
group_by(Account, type_order) %>%
summarise(first_appearance = min(date))
Account | date | type | order | type_order |
---|---|---|---|---|
1 | 2021-08-31 | 0 | A | 0A |
1 | 2021-09-30 | 5 | A | 5A |
1 | 2021-10-30 | 0 | B | 0B |
1 | 2022-01-31 | 8 | A | 8A |
1 | 2022-02-02 | 0 | C | 0C |
Maybe use rleid
from data.table
to assign groups when there are differences in type
from row to row.
library(tidyverse)
library(data.table)
df %>%
group_by(Account, grp = rleid(type), type, order) %>%
summarise(first_appearance = min(date))
Output
Account grp type order first_appearance
<int> <int> <int> <chr> <chr>
1 1 1 0 A 2021-08-31
2 1 2 5 A 2021-09-30
3 1 3 0 B 2021-10-30
4 1 4 8 A 2022-01-31
5 1 5 0 C 2022-02-02