Select most recent date in a panel RStudio

I have a data frame which is a panel: I have different CLPs (it's like the ID of an event) repeated several times, one for each phase of the event: organization, set-up, start, end, clean-up. Each phase has a starting date and final date. I would like to create a dummy equal 1 for the most recent final date of each CLP, and if it is missing, but the starting date of the following phase is not missing and more recent than 30/04/2024, then I would like to have dummy =1 for it. Date is POSIXct format.

I provide you an example of the result i'm looking for:

| CLP       | Phase   | Starting date | Final Date | Dummy|

| -------- | -------- | ------------- | ---------- | ---- |

| A        | 1        | 01/01/2019    | 29/01/2019 | 0 |

| A        | 2        | 03/04/2020    | 04/12/2022 | 1 |

| B       | 2        | 01/05/2023    |        NA   | 0 |

| B       | 3        | 03/04/2024    |        NA   | 1 |

Thanks a lot

I grouped by CLP but then I am not able to code the date problem.

Solution

I think we need a bigger sample to proper validate this, but it works as expected with the provided data. Check it out:

library(tidyverse)

# Toy data
my_df <- structure(list(
  clp        = c("A", "A", "B", "B"), 
  phase      = c(1, 2, 2, 3), 
  start_date = structure(c(17897, 18355, 19478, 19816), class = "Date"), 
  end_date   = structure(c(17925, 19330, NA, NA),       class = "Date"), 
  dummy      = c(0, 1, 0, 1)),
  
  row.names  = c(NA, -4L), 
  class      = c("tbl_df", "tbl", "data.frame"))

my_date <- dmy("30/04/2024")

#
new_df <- my_df %>%
  mutate(
    .by = clp, 
    my_dummy = case_when(
      any(!is.na(end_date)) & end_date == max(end_date, na.rm = TRUE)        ~ 1,
      all(is.na(end_date))  & !is.na(lag(start_date)) & start_date < my_date ~ 1,
      .default = 0))

Output:

> new_df
# A tibble: 4 × 6
  clp   phase start_date end_date   dummy my_dummy
  <chr> <dbl> <date>     <date>     <dbl>    <dbl>
1 A         1 2019-01-01 2019-01-29     0        0
2 A         2 2020-04-03 2022-12-04     1        1
3 B         2 2023-05-01 NA             0        0
4 B         3 2024-04-03 NA             1        1

^{Created on 2024-05-29 with reprex v2.1.0}