Keep rows that fall within a specific interval for different conditions, within each group

Here's a reprex for illustration.

library(tidyverse)

set.seed(1337)
df <- tibble(
  date_visit = sample(seq(as.Date("2020/01/01"),
    as.Date("2021/01/01"),
    by = "day"
  ), 400, replace = TRUE),
  patient_id = as.factor(paste("patient", sample(seq(1, 13), 400, replace = TRUE), sep = "_")),
  type_of_visit = as.factor(sample(c("medical", "veterinary"), 400, replace = TRUE))
)

What I'm trying to do is create a data frame that keeps the patient_id (with a group by, I assume) and the visit types when that patient has had 2 different types of visit in less than 24 hours. Alternatively, I could add a TRUE/FALSE variable that says whether that condition is met.

I tried a left join by patient_id so I could work with 2 different variables, but that takes too much computing time (my original data frame is much longer than this).

Can someone point me in the right direction?

Thank you

Solution

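Maybe this will help. Since date_visit is a Date, two visits less than 24 hours apart fall on the same day, so you can count the distinct visit types per patient and date, then check whether any date has two or more: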

library(dplyr)

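df %>%
  # one row per patient and date: TRUE if at least two visit types occur on that date
  group_by(patient_id, date_visit) %>%
  summarise(flag = n_distinct(type_of_visit) >= 2, .groups = "drop_last") %>%
  # still grouped by patient_id: TRUE if any date was flagged
  summarise(flag = any(flag))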

#  patient_id flag 
#   <fct>      <lgl>
# 1 patient_1  TRUE 
# 2 patient_10 FALSE
# 3 patient_11 TRUE 
# 4 patient_12 FALSE
# 5 patient_13 FALSE
# 6 patient_2  FALSE
# 7 patient_3  FALSE
# 8 patient_4  FALSE
# 9 patient_5  TRUE 
#10 patient_6  FALSE
#11 patient_7  TRUE 
#12 patient_8  TRUE 
#13 patient_9  TRUE
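
The code above compares calendar dates. If your real data stores timestamps rather than dates, a rough sketch of a true 24-hour check could look like the following. The `visit_time` column and the small `visits` data frame are made up for illustration and are not part of the reprex. Within each patient the visits are sorted by time and each visit is compared with the previous one; checking neighbouring visits is enough, because if any two visits of different types lie within 24 hours, some pair of consecutive visits between them must also differ in type and be at least as close.

```r
library(dplyr)

# Hypothetical data with timestamps (not from the question's reprex)
visits <- data.frame(
  patient_id = c("patient_1", "patient_1", "patient_1", "patient_2", "patient_2"),
  visit_time = as.POSIXct(
    c("2020-01-01 22:00:00", "2020-01-02 09:00:00", "2020-03-05 10:00:00",
      "2020-01-01 08:00:00", "2020-01-03 08:00:00"),
    tz = "UTC"
  ),
  type_of_visit = c("medical", "veterinary", "medical", "medical", "veterinary"),
  stringsAsFactors = FALSE
)

within_24h <- visits %>%
  arrange(patient_id, visit_time) %>%
  group_by(patient_id) %>%
  mutate(
    # hours elapsed since this patient's previous visit (NA for the first visit)
    hours_since_prev = as.numeric(difftime(visit_time, lag(visit_time), units = "hours")),
    # TRUE when the previous visit was a different type and less than 24 hours earlier
    close_pair = !is.na(hours_since_prev) &
      hours_since_prev < 24 &
      type_of_visit != lag(type_of_visit)
  ) %>%
  # one row per patient: TRUE if any such pair exists
  summarise(flag = any(close_pair))
```

Here patient_1 has a medical and a veterinary visit 11 hours apart on different calendar days, which a same-date comparison would miss.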

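If you want to keep the rows for those patient IDs (one row per patient and date after summarising):

df %>%
  group_by(patient_id, date_visit) %>%
  summarise(flag = n_distinct(type_of_visit) >= 2, .groups = "drop_last") %>%
  # still grouped by patient_id: keep patients with at least one flagged date
  filter(any(flag))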
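
If the aim is instead a TRUE/FALSE column on the original visit rows (the second option in the question), one possible sketch swaps summarise() for mutate(), so no rows are collapsed. The `visits` data frame and the column names `same_day_flag` and `patient_flag` are invented for this example.

```r
library(dplyr)

# Small made-up data set, only for illustration
visits <- data.frame(
  patient_id    = c("patient_1", "patient_1", "patient_1", "patient_2", "patient_2"),
  date_visit    = as.Date(c("2020-01-01", "2020-01-01", "2020-03-05", "2020-01-01", "2020-02-01")),
  type_of_visit = c("medical", "veterinary", "medical", "medical", "veterinary"),
  stringsAsFactors = FALSE
)

flagged <- visits %>%
  group_by(patient_id, date_visit) %>%
  # TRUE for every row on a day with at least two distinct visit types
  mutate(same_day_flag = n_distinct(type_of_visit) >= 2) %>%
  group_by(patient_id) %>%
  # TRUE for every row of a patient who has at least one such day
  mutate(patient_flag = any(same_day_flag)) %>%
  ungroup()
```

Filtering on `patient_flag` then keeps every original row for the matching patients, and filtering on `same_day_flag` keeps only the visits on the flagged days.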