r dplyr lubridate

How to find if a date is within a given time interval


I have the following data frame with dates.

ID   start_date      end_date      Intrvl                    a_date           b_date          c_date
1     2013-12-01     2014-05-01    2013-12-01--2014-05-01    2014-01-01       2014-03-10      2015-03-10       
2     2016-01-01     2016-07-01    2016-01-01--2016-07-01    2014-02-01       NA              2016-02-01
3     2014-01-01     2014-07-01    2014-01-01--2014-07-01    2014-02-01       2016-02-01      2014-07-01    

I want to know:

  1. Whether the dates in columns a_date, b_date and c_date fall within the interval that I calculated using lubridate::interval(start_date, end_date). In reality, my data frame has 400 columns.

  2. The names of the date columns whose dates fall within the corresponding interval, like the output below:

    ID  Within_Intrvl
    1   a_b  
    2   a  
    3   a_c
    

I have read the answers to this question [link], but they did not help me. Thank you!


Solution

  • Assuming your date columns are converted with lubridate and the interval is built from them, for example:

    library(dplyr)
    library(lubridate)

    # Parse every date column, then build the interval from start to end
    input <- df %>%
      mutate(start_date = ymd(start_date),
             end_date   = ymd(end_date),
             a_date     = ymd(a_date),
             b_date     = ymd(b_date),
             c_date     = ymd(c_date),
             Intrvl     = interval(start_date, end_date))
    

    you can then use lubridate's %within% operator:

    # Flag each date column that falls inside the row's interval.
    # An NA date gives NA here, which paste() then prints as "NA".
    result <- input %>%
      mutate(AinIntrvl = if_else(a_date %within% Intrvl, "a", "")) %>%
      mutate(BinIntrvl = if_else(b_date %within% Intrvl, "b", "")) %>%
      mutate(CinIntrvl = if_else(c_date %within% Intrvl, "c", "")) %>%
      mutate(Within_Intrvl = paste(AinIntrvl, BinIntrvl, CinIntrvl, sep = "_")) %>%
      select(-start_date, -end_date, -Intrvl, -a_date, -b_date, -c_date)
    

    You can format the Within_Intrvl column as you like, as well as decide how you want to deal with NAs.
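
    With 400 columns, writing one mutate() per column gets tedious. Below is a minimal sketch of one way to loop over the columns instead. It rests on some assumptions that are not in the question: every column to test has a name ending in "_date", the part before that suffix is the label you want, and an NA date counts as "not within". The names date_cols, inside and prefixes are only illustrative.

```r
library(dplyr)
library(lubridate)

# Sample data copied from the question
df <- tibble(
  ID         = 1:3,
  start_date = c("2013-12-01", "2016-01-01", "2014-01-01"),
  end_date   = c("2014-05-01", "2016-07-01", "2014-07-01"),
  a_date     = c("2014-01-01", "2014-02-01", "2014-02-01"),
  b_date     = c("2014-03-10", NA,           "2016-02-01"),
  c_date     = c("2015-03-10", "2016-02-01", "2014-07-01")
)

# Assumption: the columns to test end in "_date" and are not start/end
date_cols <- setdiff(grep("_date$", names(df), value = TRUE),
                     c("start_date", "end_date"))

# Parse all date columns at once, then build the interval
input <- df %>%
  mutate(across(ends_with("_date"), ymd),
         Intrvl = interval(start_date, end_date))

# One logical column per date column; NA dates are treated as FALSE
inside <- sapply(date_cols, function(col) {
  hit <- input[[col]] %within% input$Intrvl
  !is.na(hit) & hit
})

# Per row, keep the labels of the matching columns and join them with "_"
prefixes <- sub("_date$", "", date_cols)
result <- tibble(
  ID            = input$ID,
  Within_Intrvl = apply(inside, 1, function(r) paste(prefixes[r], collapse = "_"))
)
```

    A row with no matching date gets an empty string. Note that apply() needs a matrix, so this sketch assumes the data frame has more than one row.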