How to put multiple breaks in cut function in R?


I have three columns that I want to label using a different set of breaks for each column, as in the example below. I can label several columns with the same breaks, but I don't know how to apply different breaks (br1, br2, br3) to each column.

var <- 1:10

x1 <- rnorm(10, mean=100, sd=25)
x2 <- rnorm(10, mean=100, sd=25)
x3 <- rnorm(10, mean=100, sd=25)
df <- data.frame(var,x1,x2,x3)

# One set of breaks for all the columns
br1 <- c(50,60,70,80,90,100,110,120,130,140)
df2 <- data.frame(lapply(df[, 2:4], cut, breaks=br1, labels=c(1:9)))

The problem: how can I use the following breaks (br2 and br3) for columns x2 and x3 in the same code, so that each column (x1, x2, x3) gets its own breaks (br1, br2, br3)?

#breaks 2 and 3
br2 <-c(30,40,45,55,61,70,98,105,115,138)
br3<-c(20,25,30,35,38,42,45,70,95,132)

Solution

  • You can use Map:

    Map(function(x, y, labels=1:9) cut(x, y, labels = labels), df[, 2:4], list(br1, br2, br3))
    

    The output is a list with one factor per column of df[, 2:4]. It can be converted to a data frame using as.data.frame. You can also pass other arguments to cut (e.g., include.lowest). Values that fall outside the supplied breaks become NA.

    # OUTPUT
    $x1
     [1] 6    8    <NA> 8    9    4    5    <NA> 6    8   
    Levels: 1 2 3 4 5 6 7 8 9
    
    $x2
     [1] 8 6 6 5 3 7 8 6 9 8
    Levels: 1 2 3 4 5 6 7 8 9
    
    $x3
     [1] 8    9    <NA> 9    9    9    <NA> 9    8    8   
    Levels: 1 2 3 4 5 6 7 8 9
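
As a minimal sketch of the conversion mentioned above, the list returned by Map can be wrapped in as.data.frame, and extra cut arguments such as include.lowest can be passed inside the anonymous function. The names breaks_list, binned and df2 are illustrative, and the seed is an assumption, so the labels will not necessarily match the output shown above.

```r
set.seed(123)  # assumed seed, used only to make the sketch reproducible
df <- data.frame(var = 1:10,
                 x1 = rnorm(10, mean = 100, sd = 25),
                 x2 = rnorm(10, mean = 100, sd = 25),
                 x3 = rnorm(10, mean = 100, sd = 25))

br1 <- c(50, 60, 70, 80, 90, 100, 110, 120, 130, 140)
br2 <- c(30, 40, 45, 55, 61, 70, 98, 105, 115, 138)
br3 <- c(20, 25, 30, 35, 38, 42, 45, 70, 95, 132)
breaks_list <- list(br1, br2, br3)  # hypothetical helper name

# Map pairs each column with its own break vector
binned <- Map(function(x, br) cut(x, breaks = br, labels = 1:9, include.lowest = TRUE),
              df[, 2:4], breaks_list)

# Bind the factor columns back together, keeping the id column
df2 <- data.frame(var = df$var, as.data.frame(binned))
str(df2)
```

Because Map takes its names from the first list argument, the resulting columns keep the names x1, x2 and x3.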
    

    Data

    set.seed(123)
    var <- 1:10
    
    x1 <- rnorm(10, mean=100, sd=25)
    x2 <- rnorm(10, mean=100, sd=25)
    x3 <- rnorm(10, mean=100, sd=25)
    df <- data.frame(var,x1,x2,x3)
    
    # One set of breaks per column
    br1 <-c(50,60,70,80,90,100,110,120,130,140)
    br2 <-c(30,40,45,55,61,70,98,105,115,138)
    br3<-c(20,25,30,35,38,42,45,70,95,132)
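
Since values outside the supplied breaks become NA, one possible workaround, which is not part of the original answer, is to replace the outermost breaks with -Inf and Inf so that the first and last bins are open-ended. The helper open_ends is a hypothetical name, and widening the end bins changes what labels 1 and 9 mean, so treat this only as a sketch.

```r
# Hypothetical helper: make the first and last bins open-ended
open_ends <- function(br) {
  br[1] <- -Inf
  br[length(br)] <- Inf
  br
}

br1 <- c(50, 60, 70, 80, 90, 100, 110, 120, 130, 140)
x <- c(45, 55, 95, 139, 150)  # 45 and 150 lie outside br1

plain  <- cut(x, breaks = br1, labels = 1:9)
padded <- cut(x, breaks = open_ends(br1), labels = 1:9)

is.na(plain)        # TRUE FALSE FALSE FALSE TRUE
as.integer(padded)  # 1 1 5 9 9
```

The same helper could be applied to each break vector before the Map call, e.g. lapply(list(br1, br2, br3), open_ends).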