Search code examples
rdategrouping

Get date last condition by group


Consider this data frame:

   idPerson idAppt decision date      
 1 A             1 a        2021-09-10
 2 A             1 b        2021-09-11
 3 A             1 c        2021-09-12
 4 A             1 d        2021-09-13
 5 A             2 a        2021-09-20
 6 A             2 b        2021-09-21
 7 A             3 a        2021-09-10
 8 A             3 b        2021-09-11
 9 B             1 a        2021-09-10
10 B             1 b        2021-09-11
11 B             1 c        2021-09-12
12 B             1 d        2021-09-13
13 B             2 a        2021-09-11
14 B             2 b        2021-09-12
15 B             3 a        2021-09-14
16 B             3 b        2021-09-15

For each groups of idPerson and idAppt, I'd like to get a date2 column, but with certain conditions:

  • For any groups (idPerson x idAppt) whose decision == "a" starts later than the date of decision == "d" of any other idAppt group with the same idPerson, report the date when decision == "d" of that group.
  • For any other group that do not meet this requirement, date2 should be the earliest date for this given idPerson.

Which yields this data frame:

   idPerson idAppt decision date       date2     
 1 A             1 a        2021-09-10 2021-09-10
 2 A             1 b        2021-09-11 2021-09-10
 3 A             1 c        2021-09-12 2021-09-10
 4 A             1 d        2021-09-13 2021-09-10
 5 A             2 a        2021-09-20 2021-09-13
 6 A             2 b        2021-09-21 2021-09-13
 7 A             3 a        2021-09-10 2021-09-10
 8 A             3 b        2021-09-11 2021-09-10
 9 B             1 a        2021-09-10 2021-09-10
10 B             1 b        2021-09-11 2021-09-10
11 B             1 c        2021-09-12 2021-09-10
12 B             1 d        2021-09-13 2021-09-10
13 B             2 a        2021-09-11 2021-09-10
14 B             2 b        2021-09-12 2021-09-10
15 B             3 a        2021-09-14 2021-09-13
16 B             3 b        2021-09-15 2021-09-13

Data:

df <- structure(list(idPerson = c("A", "A", "A", "A", "A", "A", "A", 
"A", "B", "B", "B", "B", "B", "B", "B", "B"), idAppt = c(1L, 
1L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L), 
    decision = c("a", "b", "c", "d", "a", "b", "a", "b", "a", 
    "b", "c", "d", "a", "b", "a", "b"), date = structure(c(18880, 
    18881, 18882, 18883, 18890, 18891, 18880, 18881, 18880, 18881, 
    18882, 18883, 18881, 18882, 18884, 18885), class = "Date")), class = c("tbl_df", 
"tbl", "data.frame"), row.names = c(NA, -16L))

EO <- structure(list(idPerson = c("A", "A", "A", "A", "A", "A", "A", 
"A", "B", "B", "B", "B", "B", "B", "B", "B"), idAppt = c(1L, 
1L, 1L, 1L, 2L, 2L, 3L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 3L, 3L), 
    decision = c("a", "b", "c", "d", "a", "b", "a", "b", "a", 
    "b", "c", "d", "a", "b", "a", "b"), date = structure(c(18880, 
    18881, 18882, 18883, 18890, 18891, 18880, 18881, 18880, 18881, 
    18882, 18883, 18881, 18882, 18884, 18885), class = "Date"), 
    date2 = c("2021-09-10", "2021-09-10", "2021-09-10", "2021-09-10", 
    "2021-09-13", "2021-09-13", "2021-09-10", "2021-09-10", "2021-09-10", 
    "2021-09-10", "2021-09-10", "2021-09-10", "2021-09-10", "2021-09-10", 
    "2021-09-13", "2021-09-13")), row.names = c(NA, -16L), class = c("tbl_df", 
"tbl", "data.frame"))

Solution

  • df %>%
      group_by(idPerson) %>%
      mutate(d_date = min(date[decision == "d"]), min_date_person = min(date)) %>%
      group_by(idPerson, idAppt) %>%
      mutate(date2 = if_else(date[decision == "a"] > d_date, d_date, min_date_person)) %>%
      ungroup()
    # # A tibble: 16 × 7
    #    idPerson idAppt decision date       d_date     min_date_person date2     
    #    <chr>     <int> <chr>    <date>     <date>     <date>          <date>    
    #  1 A             1 a        2021-09-10 2021-09-13 2021-09-10      2021-09-10
    #  2 A             1 b        2021-09-11 2021-09-13 2021-09-10      2021-09-10
    #  3 A             1 c        2021-09-12 2021-09-13 2021-09-10      2021-09-10
    #  4 A             1 d        2021-09-13 2021-09-13 2021-09-10      2021-09-10
    #  5 A             2 a        2021-09-20 2021-09-13 2021-09-10      2021-09-13
    #  6 A             2 b        2021-09-21 2021-09-13 2021-09-10      2021-09-13
    #  7 A             3 a        2021-09-10 2021-09-13 2021-09-10      2021-09-10
    #  8 A             3 b        2021-09-11 2021-09-13 2021-09-10      2021-09-10
    #  9 B             1 a        2021-09-10 2021-09-13 2021-09-10      2021-09-10
    # 10 B             1 b        2021-09-11 2021-09-13 2021-09-10      2021-09-10
    # 11 B             1 c        2021-09-12 2021-09-13 2021-09-10      2021-09-10
    # 12 B             1 d        2021-09-13 2021-09-13 2021-09-10      2021-09-10
    # 13 B             2 a        2021-09-11 2021-09-13 2021-09-10      2021-09-10
    # 14 B             2 b        2021-09-12 2021-09-13 2021-09-10      2021-09-10
    # 15 B             3 a        2021-09-14 2021-09-13 2021-09-10      2021-09-13
    # 16 B             3 b        2021-09-15 2021-09-13 2021-09-10      2021-09-13