r performance loops apply intervals

Check whether elements of vectors are inside intervals given by matrix


This is a nice problem for which I found a solution (see below), but it is not elegant:

Assume you have a vector x and a matrix A which contains the start of an interval in the first column and the end of the interval in the second.
How can I get the elements of x that fall into the intervals given by A?

x <- c(4, 7, 15)

A <- cbind(c(3, 9, 14), c(5, 11, 16))

Expected output:

[1] 4 15

You could use the following information if it helps improve performance:
Both the vector and the rows of the matrix are sorted, and the intervals don't overlap. All intervals have the same length. All numbers are integers, but they can be huge.

Now I did not want to be lazy and came up with the following solution, which is too slow for long vectors and matrices:

x <- c(4, 7, 15)  # Define input vector

A <- cbind(c(3, 9, 14), c(5, 11, 16))  # Define matrix with intervals

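b <- vector()  # Collects every integer covered by any interval

for (i in seq_len(nrow(A))) {
  b <- c(b, A[i, 1]:A[i, 2])  # Expand interval i (endpoints included) and append it
}

x[x %in% b]  # Keep the elements of x that appear in some interval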

I know that loops in R can be slow, but I did not know how to write the operation without one (maybe there is a way with apply).


Solution

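  • We can use sapply to loop over each element of x and check whether it lies within any of the intervals in A. The comparison uses >= and <= so that the interval endpoints are included, matching the A[i, 1]:A[i, 2] ranges in the question.

    x[sapply(x, function(i) any(i >= A[, 1] & i <= A[, 2]))]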
    #[1]  4 15
    

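    If length(x) and nrow(A) are the same and each x[i] only needs to be checked against the interval in row i, we don't need the sapply loop and can use the comparison directly. Note that this is an element-wise comparison, so it will not detect an element that falls into a different row's interval.

    x[x >= A[, 1] & x <= A[, 2]]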
    #[1]  4 15
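
Since the question states that the rows of A are sorted and the intervals don't overlap, a fully vectorized alternative may be worth trying for long inputs. The following is a minimal sketch, not a benchmarked solution: it uses base R's findInterval to locate, for each element of x, the last interval whose start is not greater than that element, and then checks the element against that interval's end. The function name in_intervals is a hypothetical helper introduced here for illustration, and treating the endpoints as inclusive is an assumption based on the A[i, 1]:A[i, 2] ranges in the question.

```r
# Hypothetical helper: returns the elements of x that lie in any interval of A.
# Assumes the interval starts in A[, 1] are sorted in increasing order and the
# intervals do not overlap, as stated in the question. Endpoints are inclusive.
in_intervals <- function(x, A) {
  # For each x, index of the last row whose start is <= x (0 if there is none)
  idx <- findInterval(x, A[, 1])

  # Elements smaller than every start cannot be inside any interval
  inside <- idx > 0

  # For the remaining elements, compare against the end of the candidate interval
  inside[inside] <- x[inside] <= A[idx[inside], 2]

  x[inside]
}

x <- c(4, 7, 15)
A <- cbind(c(3, 9, 14), c(5, 11, 16))

in_intervals(x, A)
#[1]  4 15
```

Because the intervals don't overlap, only one candidate interval per element needs to be checked, so this avoids both the explicit loop and the expansion of every interval into its individual integers.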