R logical issue: longer object length is not a multiple of shorter object length (dplyr::if_else())


I am not sure how to rewrite my code with something other than if_else() while keeping it efficient. Here is a simplified example of my original code:

library(dplyr)

# Goal: find which threshold interval each record falls into
data <- c(runif(99999), NA)
threshold <- seq(0, 1, by=0.1)
rank <- if_else(is.na(data), NA, max(which(data >= threshold))) # Error: longer object length is not a multiple of shorter object length

Thank you.


Solution

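  • I think if_else isn't the right function here: data >= threshold compares the two vectors element by element, recycling the shorter one, rather than comparing each value against every threshold. Try findInterval or cut, which find the threshold bucket each value falls into.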

    findInterval(data, threshold)
    

    With cut:

    cut(data, threshold)
    

    Use cut with labels = FALSE to get the interval index instead of a factor:

    cut(data, threshold, labels = FALSE)
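
To check that these really do what the original max(which(...)) was aiming for, here is a minimal sketch on a small hand-picked vector (the values and the variable names are my own, chosen for illustration, not from the question). It also shows that findInterval and cut treat a value sitting exactly on a threshold differently by default:

```r
# Hand-picked values (assumption for illustration); 0.5 sits exactly on a threshold
data <- c(0.05, 0.5, 0.37, 0.99, NA)
threshold <- seq(0, 1, by = 0.1)

# The intent of the question's code, applied to one value at a time (slow on large data)
per_value <- sapply(data, function(x) {
  if (is.na(x)) NA_integer_ else max(which(x >= threshold))
})

# Vectorized: intervals are closed on the left, [t_i, t_(i+1)); NA stays NA
by_find_interval <- findInterval(data, threshold)

# cut() defaults to right-closed intervals, (t_i, t_(i+1)]
by_cut_default <- cut(data, threshold, labels = FALSE)

# right = FALSE switches cut() to the same left-closed convention as findInterval()
by_cut_left <- cut(data, threshold, labels = FALSE, right = FALSE)

print(per_value)         # 1 6 4 10 NA
print(by_find_interval)  # 1 6 4 10 NA
print(by_cut_default)    # 1 5 4 10 NA  (0.5 lands in the lower bucket)
print(by_cut_left)       # 1 6 4 10 NA
```

So findInterval(data, threshold) looks like the closest drop-in replacement here. If you prefer cut, keep in mind that a value exactly equal to the lowest break (with the default) or the highest break (with right = FALSE) comes back as NA unless you also pass include.lowest = TRUE.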