
Alternative to cut function in R for data.tables - integer variables to factors


I want to convert the integer variable hp to a categorical variable, cut by 10.

mtcars[, hp_cat := cut(hp,
    breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, Inf),
    include.lowest = TRUE)]

This yields the desired result, but it is quite tedious to write out all the numbers. Is there a faster way? Ideally, the alternative would also produce nicer factor names.

Note: I would like the result to stay in a data.table, so no dplyr.


Solution

  • Just use the sequence function, seq(). Depending on the situation, you may need -Inf as the first element of the vector. The labels parameter of cut() also lets you assign names; for example, labels = paste0("Group", 2:length(BRKS)) works with the code below.

    library(data.table)
    mtcars <- as.data.table(mtcars)  # := only works on a data.table

    BRKS <- c(seq(0, 160, 10), Inf)

    mtcars[, hp_cat := cut(hp, breaks = BRKS, include.lowest = TRUE)]
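For the "nicer factor names" part of the question, the labels can be derived from the break vector instead of being typed out. The following is only a sketch under some assumptions: hp holds whole numbers (as it does in mtcars), the names dt, brks, lo, hi and labs are made up for this example, and the "0-10", "11-20", ..., "161+" naming scheme is just one possible choice.

```r
library(data.table)

# Work on a copy; mtcars ships as a data.frame, and := needs a data.table.
dt <- as.data.table(mtcars)

# Same breaks as above: 0, 10, ..., 160, Inf
brks <- c(seq(0, 160, by = 10), Inf)

# One label per interval, built from the lower and upper edge of each bin.
n  <- length(brks) - 1
lo <- brks[1:n]
hi <- brks[2:(n + 1)]

# With right-closed bins and include.lowest = TRUE, only the first bin
# contains its lower edge, so every other bin starts at lo + 1
# (this assumes integer-valued data).
lo_lab <- ifelse(seq_len(n) == 1, lo, lo + 1)
labs   <- ifelse(is.infinite(hi),
                 paste0(lo_lab, "+"),
                 paste0(lo_lab, "-", hi))

dt[, hp_cat := cut(hp, breaks = brks, labels = labs, include.lowest = TRUE)]
```

Changing the bin width then only means changing the by argument of seq(); the labels follow automatically.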