Cut function in R is not creating the desired number of intervals


My values span a fairly narrow range, but I want the cut function to place values in every requested interval, even if that makes each interval quite narrow. It is not doing this:

temp <- c(8.32, 8.43, 8.41, 7.86, 7.98, 7.86, 8.07, 8.51, 7.92, 7.94, 8.36)
bins = 3
labels = c("Low","Medium","High")

categories <- cut(temp, breaks=bins, labels=labels)

> categories
 [1] High High High Low  Low  Low  Low  High Low  Low  High
Levels: Low Medium High

But if I ask it to cut into 4 intervals, I do get more of a range:

bins = 4
labels = c("Low","Medium","High", "Very High")

categories <- cut(temp, breaks=bins, labels=labels)

> categories
 [1] High      Very High Very High Low       Low       Low       Medium    Very High Low       Low       Very High
Levels: Low Medium High Very High

How can I get my 3 interval range to include some values as "medium"?


Solution
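
Before changing anything, it may help to see why "Medium" comes out empty. When `breaks` is a single number, `cut` divides the range of the data into that many equal-width intervals, so all three intervals are created but no value happens to fall in the middle one. The sketch below reuses `temp` from the question; the names `equal_width` and `counts` are only illustrative.

```r
temp <- c(8.32, 8.43, 8.41, 7.86, 7.98, 7.86, 8.07, 8.51, 7.92, 7.94, 8.36)

# With a single number, cut() splits the data range into equal-width
# intervals. Leaving out labels makes the levels show the interval bounds.
equal_width <- cut(temp, breaks = 3)

# Count the values per interval; the middle interval has a count of zero.
counts <- table(equal_width)
print(counts)
```

The middle third of the range runs from roughly 8.08 to 8.29, and none of the eleven values lie there, which is why the level exists but is never used.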

    1) quantile. Try cutting on quantiles, so that each interval holds roughly the same number of values:

    bins <- length(labels)
    cut(temp, breaks = quantile(temp, 0:bins / bins), labels = labels,
      include.lowest = TRUE)
    
    
    ##  [1] Medium High   High   Low    Medium Low    Medium High   Low    Low   
    ## [11] High  
    ## Levels: Low Medium High
    

    2) quantcut. Alternatively, use quantcut from the gtools package:

    library(gtools)
    quantcut(temp, q = length(labels), labels = labels)
    
    ##  [1] Medium High   High   Low    Medium Low    Medium High   Low    Low   
    ## [11] High  
    ## Levels: Low Medium High
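
As a quick check on the quantile approach, the break points and the number of values per label can be inspected directly. This is a small sketch using only base R; `by_quantile` and `quantile_counts` are illustrative names, not part of the original answer.

```r
temp <- c(8.32, 8.43, 8.41, 7.86, 7.98, 7.86, 8.07, 8.51, 7.92, 7.94, 8.36)
labels <- c("Low", "Medium", "High")
bins <- length(labels)

# Break points at the 0, 1/3, 2/3 and 1 quantiles of the data.
breaks <- quantile(temp, 0:bins / bins)
print(breaks)

# include.lowest = TRUE keeps the minimum value in the first interval.
by_quantile <- cut(temp, breaks = breaks, labels = labels,
                   include.lowest = TRUE)

# Count the values per label.
quantile_counts <- table(by_quantile)
print(quantile_counts)
```

Note that the intervals are no longer equal in width, only roughly equal in count. If the data contain many tied values, two quantiles can coincide, in which case `cut` stops with an error about non-unique breaks.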