
Does the tidymodels recipes package provide functions to manipulate logical variables?


I am looking through the recipes documentation and I find step_string2factor() and step_num2factor(), but no equivalent step_logical2factor(). I'm building a classification model and need to convert my TRUE/FALSE outcome to a factor. Does that have to be done in a preprocessing step? If so, why? Isn't the point of recipes to provide preprocessing functionality?


Solution

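  • You can make step_num2factor() work by first converting the logical columns to numeric with step_mutate() and as.numeric. Because step_num2factor() uses the values as 1-based indices into levels, the transform argument shifts 0/1 to 1/2.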

    library(recipes)
    library(dplyr)
    
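    # Example data: turn two 0/1 columns into logicals
    mtcars0 <- mtcars %>%
      mutate(across(c(vs, am), as.logical))
    
    recipe(~., data = mtcars0) %>%
      # logical -> numeric (FALSE = 0, TRUE = 1)
      step_mutate(across(c(vs, am), as.numeric)) %>%
      # shift to 1/2 so the values index into `levels`
      step_num2factor(vs, am, transform = function(x) x + 1, levels = c("FALSE", "TRUE")) %>%
      prep() %>%
      bake(new_data = NULL)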
    #> # A tibble: 32 × 11
    #>      mpg   cyl  disp    hp  drat    wt  qsec vs    am     gear  carb
    #>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <fct> <fct> <dbl> <dbl>
    #>  1  21       6  160    110  3.9   2.62  16.5 FALSE TRUE      4     4
    #>  2  21       6  160    110  3.9   2.88  17.0 FALSE TRUE      4     4
    #>  3  22.8     4  108     93  3.85  2.32  18.6 TRUE  TRUE      4     1
    #>  4  21.4     6  258    110  3.08  3.22  19.4 TRUE  FALSE     3     1
    #>  5  18.7     8  360    175  3.15  3.44  17.0 FALSE FALSE     3     2
    #>  6  18.1     6  225    105  2.76  3.46  20.2 TRUE  FALSE     3     1
    #>  7  14.3     8  360    245  3.21  3.57  15.8 FALSE FALSE     3     4
    #>  8  24.4     4  147.    62  3.69  3.19  20   TRUE  FALSE     4     2
    #>  9  22.8     4  141.    95  3.92  3.15  22.9 TRUE  FALSE     4     2
    #> 10  19.2     6  168.   123  3.92  3.44  18.3 TRUE  FALSE     4     4
    #> # … with 22 more rows
    

    The feature you are suggesting has been requested before but has not been implemented yet at the time of this answer.
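
    Since step_mutate() accepts arbitrary expressions, the conversion can probably also be done in a single step by calling factor() directly. The following is a minimal sketch rather than an official recipes idiom; it reuses the mtcars0 example data from above, and the name baked is only illustrative.

```r
library(recipes)
library(dplyr)

# Same example data as above: two 0/1 columns turned into logicals
mtcars0 <- mtcars %>%
  mutate(across(c(vs, am), as.logical))

# Sketch: convert logical -> factor in one step_mutate() call.
# Setting the levels explicitly keeps "FALSE" as the first level
# even if a column happens to contain only TRUE values.
baked <- recipe(~ ., data = mtcars0) %>%
  step_mutate(
    across(c(vs, am), function(x) factor(x, levels = c(FALSE, TRUE)))
  ) %>%
  prep() %>%
  bake(new_data = NULL)

levels(baked$vs)
```

    This should give the same factor columns as the step_num2factor() route, without the intermediate numeric conversion.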