Search code examples
rtimetidyversetidyrxts

Left join and select the next observation in time in R


Suppose I have two dataframe

df <- data.frame(ID=c("Ana", "Lola", "Ana"),
             Date=c("2020-06-06", "2020-06- 06", "2020-06- 07"),
             meat=c("fish", "poultry", "poultry"),
             time_ordered=c("2020-06-06 12:24:39", "2020-06-06 12:34:36", "2020-06-07 12:24:39"))

df2 <- data.frame(ID=c("Ana","Ana",  "Lola", "Ana"),
             Date=c("2020-06-06", "2020-06-06",  "2020-06- 06", "2020-06- 07"),
             meat=c("fish", "fish", "poultry", "poultry"),
             time_received=c("2020-06-06 12:24:40", "2020-06-06 12:26:49",  "2020-06-07 12:36:39", "2020-06-07 13:04:39"))

Suppose I want to join these two dataframes on ID and meat. Then, for a given observation, I want to match time_ordered with the first time_received following it. For instance, I should have a row "ID = Ana, Data= 2020-06-06, Meat = fish, time_ordered = 2020-06-06 12:24:39, time received = 2020-06-06 12:24:40".

So I would not matched the time_received "2020-06-06 12:26:49" with anything. In fact for each (ID, Meat, time_observed), i want to match uniquely to (ID, Meat, min(time_received) > time_observed)

Thank you so much in advance!


Solution

  • Join df by df2 by ID, meat and Date, keep only the rows where time_received > time_ordered arrange the data by time_received and keep only unique rows.

    library(dplyr)
    library(lubridate)
    
    df %>%
      left_join(df2, by = c('ID', 'meat', 'Date')) %>%
      mutate(Date = ymd(Date), 
             across(c(time_ordered, time_received), ymd_hms)) %>%
      filter(time_received >  time_ordered) %>%
      arrange(ID, Date, meat, time_received) %>%
      distinct(ID, Date, meat, .keep_all = TRUE)
    
    #    ID       Date    meat        time_ordered       time_received
    #1  Ana 2020-06-06    fish 2020-06-06 12:24:39 2020-06-06 12:24:40
    #2  Ana 2020-06-07 poultry 2020-06-07 12:24:39 2020-06-07 13:04:39
    #3 Lola 2020-06-06 poultry 2020-06-06 12:34:36 2020-06-07 12:36:39