Tags: r, loops, for-loop, cut, quartile

Continuous to discrete cut quartile in R from for loop


I am trying to create a bunch of columns that are quartile cuts based on multiple columns. For example,

dataset[,412:422] <- NA

for (i in 50:60) {
  for (j in 412:422) {
    dataset[,j] <- cut(dataset[,i],
                       breaks=unique(quantile(dataset[,i],probs=seq(.1,1,by=.1),na.rm=T)),
                       include.lowest=TRUE)
  }
}

I want to create new columns 412 to 422 by binning the continuous variables in columns 50 to 60. When I run the code above, all I get back is:

   V412    V413    V414    V415    V416    V417    V418    V419 V420    V421    V422
(56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64]
 <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>  <NA>    <NA>    <NA>


......

<NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA>    <NA> <NA>    <NA>    <NA>

I am not sure where I am going wrong. Any help would be greatly appreciated!!!


Solution

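  • The immediate problem is the nested loop: for every source column i, the inner loop overwrites all of columns 412 to 422, so every new column ends up holding the bins for column 60 only. Beyond that, this question is more about being organized and neat with your data. There are many ways to do this.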

    I would recommend separating out the data you want to bin into its own data.frame.

    x <- dataset[, 50:60]
    

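    Then bin those columns into new columns by writing a function with the parameters you want and using apply.

    function:

    # Bin a numeric vector at its 10%, 20%, ..., 100% quantiles.
    # Values below the 10% quantile fall outside the breaks and become NA.
    mycut <- function(x) cut(x,
                             breaks = unique(quantile(x, probs = seq(.1, 1, by = .1), na.rm = TRUE)),
                             include.lowest = TRUE)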
    

    apply:

    # apply() coerces the factor results, so xbin is a character matrix of interval labels
    xbin <- apply(x, 2, mycut)
    

    Then put xbin back into your dataset and name the new columns appropriately.
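
The steps above can be put together roughly as follows. This is only a sketch under stated assumptions: the toy `dataset`, the `src` index vector, and the `_bin` suffix are made up for illustration, and `lapply` is swapped in for `apply` so that the binned columns stay factors instead of being coerced to character.

```r
# Toy stand-in for the real data: three numeric columns (assumption).
set.seed(1)
dataset <- data.frame(a = rnorm(100), b = runif(100), c = rexp(100))
src <- 1:3  # in the question this would be 50:60

# Same binning rule as mycut above, repeated so the block runs on its own.
mycut <- function(x) {
  cut(x,
      breaks = unique(quantile(x, probs = seq(.1, 1, by = .1), na.rm = TRUE)),
      include.lowest = TRUE)
}

# lapply() visits each source column exactly once and keeps the factor class.
xbin <- lapply(dataset[src], mycut)
names(xbin) <- paste0(names(dataset)[src], "_bin")  # hypothetical naming scheme

# Append the binned columns to the original data frame.
dataset <- cbind(dataset, as.data.frame(xbin))
```

Because each source column is processed once and written to its own new column, nothing gets overwritten the way it does in the nested loop. If you want the lowest values binned rather than left as NA, starting `probs` at 0 instead of .1 should do it.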