r dplyr tidyverse

Parse a dynamic case_when based on threshold values


I'd like to know if there is a way to parse a dynamic case_when expression.

Let's take as an example a vector of 3 thresholds:

thresholds = c(0.4, 0.6, 0.8)

Based on that, I would like to assign the variable to one of 4 possible intervals, as written in the following case_when:

library(dplyr)

data_test <- data.frame(variable = sample(seq(0, 1, by = 0.01), size = 9))

data_test %>% mutate(value = case_when(
  variable <= 0.4 ~ "1",
  variable > 0.4 & variable <= 0.6 ~ "2",
  variable > 0.6 & variable <= 0.8 ~ "3",
  variable > 0.8 ~ "4"
))

Is there a way to do this dynamically? For example, with 5 thresholds there would be 6 possible intervals.
I would like the intervals to follow from the number of thresholds given at the start, for example by building a dynamic case_when expression that I can then evaluate with tidy evaluation.


Solution

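  • You don't need case_when for this; base R's cut() works well. Its intervals are right-closed by default, which matches the <= comparisons in the question, and it returns a factor rather than a character vector: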

    cut(data_test$variable,
        breaks = c(-Inf, thresholds, Inf),
        labels = seq_len(length(thresholds) + 1))
    

    If you want to stay in the tidyverse, call it inside mutate():

    data_test |> mutate(value = cut(variable, ...))
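
For completeness, here is a minimal sketch of the mutate() version with the cut() arguments written out, checked against the hand-written case_when from the question. The set.seed() call and the `result` and `expected` names are additions for illustration only; they are not part of the original question or answer.

```r
library(dplyr)

# Thresholds from the question; any sorted numeric vector should work.
thresholds <- c(0.4, 0.6, 0.8)

set.seed(1)  # arbitrary seed, only so the sample is reproducible
data_test <- data.frame(variable = sample(seq(0, 1, by = 0.01), size = 9))

result <- data_test |>
  mutate(
    # cut() uses right-closed intervals (a, b] by default, so a value equal
    # to a threshold falls into the lower interval, as in the case_when.
    value = cut(
      variable,
      breaks = c(-Inf, thresholds, Inf),
      labels = seq_len(length(thresholds) + 1)
    ),
    # The hand-written case_when from the question, kept for comparison.
    expected = case_when(
      variable <= 0.4 ~ "1",
      variable > 0.4 & variable <= 0.6 ~ "2",
      variable > 0.6 & variable <= 0.8 ~ "3",
      variable > 0.8 ~ "4"
    )
  )

# cut() returns a factor, so convert it before comparing with the
# character labels produced by case_when.
stopifnot(all(as.character(result$value) == result$expected))
```

Because the breaks and labels are both derived from `thresholds`, changing that vector to 5 values should give 6 intervals without touching the rest of the code. If a character column is needed instead of a factor, wrapping the cut() call in as.character() is one option.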