I would like to select the observations in a data table that fall within intervals of time specified in a second data table - the intervals are periods of time when observations are made from 2 platforms concurrently.
The first data table looks something like this. It is a bunch of animal sightings.
obs = data.table(sighting = as.POSIXct(c("2018-08-12 16:30:00", "2018-08-12 16:35:00", "2018-08-12 16:38:00", "2107-08-13 15:13:00", "2107-08-13 16:13:00", "2017-08-14 11:12:13"), format = "%Y-%m-%d %H:%M:%OS", tz = "America/Halifax"), encounter = c("1", "1", "1", "2", "3", "4"), what = c("frog", "frog", "toad", "bird", "goat","bird"))
Observations were made from 2 platforms.
platformA = data.table(station = "A", on.effort = as.POSIXct(c("2018-08-12 16:00:00", "2018-08-12 17:35:00","2017-08-14 11:00:13", "2018-08-15 17:35:00"), format = "%Y-%m-%d %H:%M:%OS", tz = "America/Halifax"), off.effort = as.POSIXct(c("2018-08-12 16:36:00", "2018-08-12 18:35:00","2017-08-14 12:12:13", "2018-08-15 18:35:00"), format = "%Y-%m-%d %H:%M:%OS", tz = "America/Halifax"))
platformB = data.table(station = "B", on.effort = as.POSIXct(c("2018-08-12 16:15:00", "2018-08-12 17:40:00", "2018-08-13 17:40:00","2017-08-14 11:05:13"), format = "%Y-%m-%d %H:%M:%OS", tz = "America/Halifax"), off.effort = as.POSIXct(c("2018-08-12 16:40:00", "2018-08-13 17:45:00", "2018-08-12 18:20:00","2017-08-14 12:30:13"), format = "%Y-%m-%d %H:%M:%OS", tz = "America/Halifax"))
I first calculated intervals for each platform, and then intersected the intervals to find out when observations were made concurrently.
setkey(platformA, on.effort, off.effort)
setkey(platformB, on.effort, off.effort)
common = foverlaps(platformA, platformB,type="any",nomatch=0)
common$x = intersect(interval(common$on.effort, common$off.effort),
interval(common$i.on.effort, common$i.off.effort))
I'd like to end up with a table that is a subset of "obs" and contains only the rows covered by the intervals in "common$x". I had hoped to use foverlaps to find the rows in the intersected intervals, and created "point" intervals for my sightings with
obs[, sighting2 := sighting]
But foverlaps wants the "start" and "end" of each interval in separate columns, which is not how the intervals are stored in common$x.
I would like my output to look like this
sighting encounter what
2018-08-12 16:30:00 1 frog
2018-08-12 16:35:00 1 frog
2017-08-14 11:12:13 4 bird
I'd appreciate any tips. Perhaps I could have been more efficient earlier on? Thanks.
I think this should work even if you have varying observation numbers between the platforms. Using your obs
, platformA
, and platformB
data as above, make intervals for both platforms more or less as you did above in common
:
common = intersect(interval(platformA$on.effort, platformA$off.effort),
interval(platformB$on.effort, platformB$off.effort))
You should be able to use %within%
to check if there are any cases where the sighting falls within the common interval:
obs$both.seen <- sapply(obs$sighting, function(s){
any(s %within% common)
})
OR
obs[, both.seen := sapply(sighting, function(x) any(x %within% common))]
New obs
:
> obs
sighting encounter what both.seen
1: 2018-08-12 16:30:00 1 frog TRUE
2: 2018-08-12 16:35:00 1 frog TRUE
3: 2018-08-12 16:38:00 1 toad FALSE
4: 2107-08-13 15:13:00 2 bird FALSE
5: 2107-08-13 16:13:00 3 goat FALSE
6: 2017-08-14 11:12:13 4 bird TRUE
Subset to get to your desired output:
obs <- obs[both.seen == 1][, both.seen := NULL][]
> obs
sighting encounter what
1: 2018-08-12 16:30:00 1 frog
2: 2018-08-12 16:35:00 1 frog
3: 2017-08-14 11:12:13 4 bird