I'm trying to calculate overlapping depth ranges for marine species and human activities. For each species there's a min and max depth it occurs at, and I want to efficiently calculate how much of that depth range overlaps with the depth range of each of 4 different activities. I think this can be done with data.table::foverlaps() or IRanges::findOverlaps(), but I can't figure out how to calculate the size of the overlap, not just whether there is one. So if species D is found between 40-100m depth, and activity 1 occurs at 0-50m depth, the overlap is 10m.
For example,
min_1 <- 0
max_1 <- 50
min_2 <- 0
max_2 <- 70
min_3 <- 0
max_3 <- 200
min_4 <- 0
max_4 <- 500
activities <- data.frame(min_1, max_1, min_2, max_2, min_3, max_3, min_4, max_4)
spp_id <- c("a", "b", "c", "d")
spp_depth_min <- c(0, 20, 30, 40)
spp_depth_max <- c(200, 500, 50, 100)
species <- data.frame(spp_id, spp_depth_min, spp_depth_max)
## data.table approach?
library(data.table)
setDT(activities)
setDT(species)
foverlaps(species, activities, ...) ## Or do I need to subset each activity and do separate calculations?
Would it be easier to write a function? I'm really unfamiliar with that! This seems like it should be a common/easy thing to do, and I don't know why it's confusing me so much.
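I restructured your activities table into long form (one row per activity) so you can do all 4 calculations at once. The overlap join then pairs each species with every activity whose depth range it intersects, and the overlap length for each pair is the smaller of the two maximum depths minus the larger of the two minimum depths.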
library(data.table)

# Long format: one row per activity (min_1 ... max_4 are defined in the question)
activities <- data.table(
  act = c('act_1', 'act_2', 'act_3', 'act_4'),
  a_min = c(min_1, min_2, min_3, min_4),
  a_max = c(max_1, max_2, max_3, max_4)
)
spp_id <- c("a", "b", "c", "d")
spp_depth_min <- c(0, 20, 30, 40)
spp_depth_max <- c(200, 500, 50, 100)
species <- data.table(spp_id, spp_depth_min, spp_depth_max)

# foverlaps() needs the second table keyed on its interval columns
setkey(activities, a_min, a_max)
ol <- foverlaps(species, activities,
  by.x = c('spp_depth_min', 'spp_depth_max'),
  by.y = c('a_min', 'a_max')
)

# Overlap length = smaller of the two maxima minus larger of the two minima
ol[, ol_length := pmin(spp_depth_max, a_max) - pmax(spp_depth_min, a_min)]
ol
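On the "would it be easier to write a function?" part: the same arithmetic also works without an interval join. Below is a minimal sketch in base R, not the only way to do it. The helper name overlap_length() and the result name overlap_m are my own (hypothetical, not from any package), and the example data is rebuilt inside the block so it runs on its own. Unlike the join above, it evaluates every species-activity pair and clamps pairs that don't overlap to 0.

```r
# Example data from the question, with activities in long form
activities <- data.frame(
  act   = c("act_1", "act_2", "act_3", "act_4"),
  a_min = c(0, 0, 0, 0),
  a_max = c(50, 70, 200, 500)
)
species <- data.frame(
  spp_id        = c("a", "b", "c", "d"),
  spp_depth_min = c(0, 20, 30, 40),
  spp_depth_max = c(200, 500, 50, 100)
)

# Hypothetical helper: length of the overlap between [min1, max1] and
# [min2, max2]; vectorised, and 0 when the two ranges do not overlap
overlap_length <- function(min1, max1, min2, max2) {
  pmax(0, pmin(max1, max2) - pmax(min1, min2))
}

# outer() calls the function once with all (species row, activity row)
# index pairs and reshapes the result into a species x activity matrix
overlap_m <- outer(
  seq_len(nrow(species)), seq_len(nrow(activities)),
  function(i, j) {
    overlap_length(
      species$spp_depth_min[i], species$spp_depth_max[i],
      activities$a_min[j], activities$a_max[j]
    )
  }
)
dimnames(overlap_m) <- list(species$spp_id, activities$act)
overlap_m
```

This gives one row per species and one column per activity; for species d the overlaps are 10, 30, 60 and 60 m. It computes all pairs, so for very large tables the foverlaps() join is likely the better fit.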