Am I trying to do the impossible? I want to match events in df1 with events in df2 whenever an event1 date falls on an event2 date or within the 10 days before it. I have pasted samples from the two datasets below. I have searched and cannot find a similar question on this forum, so maybe this is not possible. Thank you in advance!
head(df1)
# A tibble: 6 x 1
# Groups: event1 [6]
event1
<date>
1 1980-01-10
2 1980-01-13
3 1980-01-14
4 1980-02-18
5 1980-02-27
6 1980-03-02
head(df2)
event2
1 1980-01-16
2 1980-01-18
3 1980-01-19
4 1980-02-12
5 1980-09-26
6 1980-10-23
I think what I want is something like this (using the first three event2s):
library(lubridate)
ev_1 <- interval(ymd('1980-01-06'), ymd('1980-01-16'))
ev_2 <- interval(ymd('1980-01-08'), ymd('1980-01-18'))
ev_3 <- interval(ymd('1980-01-09'), ymd('1980-01-19'))
Then I want to check whether any of the event1 dates fall within one of these intervals. In total, I have about 60 event2 dates and hundreds of event1 dates over a 40-year period.
I was able to come up with this using instructions here, but is this the best approach? If so, is it possible to automate it so that I don't have to write all 60 intervals by hand?
> dates_test <- ymd(c("1980-01-10", "1980-01-13", "1980-01-14", "1980-02-18"))
> interval_test <- list(interval(ymd('1980-01-06'), ymd('1980-01-16')),
+                       interval(ymd('1980-01-09'), ymd('1980-01-19')))
> dates_test %within% interval_test
[1] TRUE TRUE TRUE FALSE
You can create all possible combinations of event1 and event2, then keep only the rows where event2 falls on the same day as event1 or up to 10 days after it.
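# Every pairing of an event1 date (Var1) with an event2 date (Var2)
combinations <- expand.grid(df1$event1, df2$event2)
# Keep the pairs where event2 is 0 to 10 days after event1
matches <- combinations[combinations[,2] >= combinations[,1] & combinations[,2] - combinations[,1] <= 10,]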
matches
Var1 Var2
1 1980-01-10 1980-01-16
2 1980-01-13 1980-01-16
3 1980-01-14 1980-01-16
7 1980-01-10 1980-01-18
8 1980-01-13 1980-01-18
9 1980-01-14 1980-01-18
13 1980-01-10 1980-01-19
14 1980-01-13 1980-01-19
15 1980-01-14 1980-01-19
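If you would rather stay with the lubridate intervals from the question, the intervals do not have to be written by hand, because interval() accepts whole vectors of start and end dates. The following is only a sketch under a few assumptions: df1 and df2 are rebuilt here from the samples in the question, and the names windows and matched are made up for illustration.

```r
library(lubridate)

# Sample data copied from the question (assumed to stand in for the real df1/df2).
df1 <- data.frame(event1 = ymd(c("1980-01-10", "1980-01-13", "1980-01-14",
                                 "1980-02-18", "1980-02-27", "1980-03-02")))
df2 <- data.frame(event2 = ymd(c("1980-01-16", "1980-01-18", "1980-01-19",
                                 "1980-02-12", "1980-09-26", "1980-10-23")))

# One interval per event2: from 10 days before the event up to the event itself.
windows <- interval(df2$event2 - days(10), df2$event2)

# For each event1 date, check whether it lies inside at least one window.
matched <- sapply(seq_along(df1$event1),
                  function(i) any(df1$event1[i] %within% windows))

# event1 dates that fall inside some 10-day window.
df1$event1[matched]
```

This only tells you which event1 dates are matched, not which event2 they belong to. If you need the pairs themselves, the expand.grid result above already gives you that.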