r, date, intervals, date-arithmetic

Test if date occurs in multiple date ranges with R


I have a data frame with multiple date ranges (45 to be exact):

Range  Start       End
1      2014-01-01  2014-02-30
2      2015-01-10  2015-03-30
3      2016-04-20  2016-10-12
...    ...         ...

The ranges never overlap.

I also have a data frame with various event dates (200K+):

Event  Date
1      2014-01-02
2      2014-03-20
3      2015-04-01
4      2016-08-18
...    ...

I want to test if these dates fall within any of these ranges:

Event  Date        InRange
1      2014-01-02  TRUE
2      2014-03-20  FALSE
3      2015-04-01  FALSE
4      2016-08-18  TRUE
...

What is the best way to perform this test? I have looked at lubridate's between and interval functions, as well as various Stack Overflow questions, but cannot find a good solution.


Solution

  • You can build a single vector containing every date covered by the ranges in the first data frame, then use the %in% operator to check whether each event date appears in that vector. Assuming the first data frame is called dateRange and the second events, with Start, End and Date stored as Date objects, the whole thing fits in one line:

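    # as.numeric() puts plain day counts on both sides, so the match does not depend on how %in% treats Date objects
    events$InRange <- as.numeric(events$Date) %in% unlist(Map(`:`, dateRange$Start, dateRange$End))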
    
    events
      Event       Date InRange
    1     1 2014-01-02    TRUE
    2     2 2014-03-20   FALSE
    3     3 2015-04-01   FALSE
    4     4 2016-08-18    TRUE
    

    Here Map is used to build the covered dates. Combined with the : operator, it creates a list with one sequence per range, running from Start to End, roughly list(2014-01-01 : 2014-02-30, 2015-01-10 : 2015-03-30, 2016-04-20 : 2016-10-12, ...) (symbolic, not valid R). Because : works on the underlying numeric values, each sequence holds day numbers rather than formatted dates. unlist then flattens the list into one vector that can be used conveniently with %in%.
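For anyone who wants to try this end to end, here is a minimal, self-contained sketch of the same idea. The sample rows are taken from the question, with one assumption: the first range ends on 2014-02-28, because 2014-02-30 is not a calendar date. The name coveredDays is only illustrative. Both sides of %in% are converted to plain day numbers with as.numeric, which should keep the comparison independent of how a given R version matches Date objects against numbers.

```r
# Sample data modelled on the question. The End of range 1 is an assumption
# (2014-02-28), since 2014-02-30 does not exist.
dateRange <- data.frame(
  Range = 1:3,
  Start = as.Date(c("2014-01-01", "2015-01-10", "2016-04-20")),
  End   = as.Date(c("2014-02-28", "2015-03-30", "2016-10-12"))
)
events <- data.frame(
  Event = 1:4,
  Date  = as.Date(c("2014-01-02", "2014-03-20", "2015-04-01", "2016-08-18"))
)

# Expand every range into the day numbers it covers (days since 1970-01-01),
# then flatten the per-range sequences into a single vector.
coveredDays <- unlist(Map(`:`, as.numeric(dateRange$Start), as.numeric(dateRange$End)))

# An event is in range if its day number appears among the covered days.
events$InRange <- as.numeric(events$Date) %in% coveredDays

print(events)
```

With 45 ranges the expanded vector stays small, so a single %in% over 200K+ event dates should be cheap. If the ranges spanned many decades, the expanded vector would grow accordingly.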