I am trying to create a set of new columns that are quantile cuts (deciles, in the code below) of several existing columns. For example,
dataset[,412:422] <- NA
for (i in 50:60) {
  for (j in 412:422) {
    dataset[,j] <- cut(dataset[,i],
                       breaks=unique(quantile(dataset[,i],probs=seq(.1,1,by=.1),na.rm=T)),
                       include.lowest=TRUE)
  }
}
I want to create new columns 412 to 422 by binning the continuous variables in columns 50 to 60. When I run the code above, all I get back is
V412 V413 V414 V415 V416 V417 V418 V419 V420 V421 V422
(56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64] (56,64]
<NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
......
<NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
I am not sure where I am going wrong. Any help would be greatly appreciated!!!
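The nested loops are the problem: for every i, the inner loop overwrites all of columns 412 to 422 with the bins of column i, so after the last pass they all hold the bins of column 60. Beyond that, this is mostly a matter of keeping your data organized, and there are many ways to do it.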
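For instance, the smallest change to the loop in the question is to pair each source column with exactly one target column instead of nesting the loops, which assigns every source column to every target column. A minimal sketch on a toy data frame: `toy`, `src`, `dst` and the `_bin` suffix are made-up names for illustration, and starting `probs` at 0 is an assumption about what is wanted (with `probs` starting at .1, values below the 10th percentile fall outside the breaks and come back as NA).

```r
set.seed(1)
# Toy stand-in for `dataset`: three continuous columns
toy <- data.frame(a = rnorm(100), b = runif(100), c = rexp(100))

src <- names(toy)            # columns to bin (50:60 in the question)
dst <- paste0(src, "_bin")   # new columns to create (412:422 in the question)

for (k in seq_along(src)) {
  v <- toy[[src[k]]]
  # probs starts at 0 so the smallest values land in the first bin
  brks <- unique(quantile(v, probs = seq(0, 1, by = 0.1), na.rm = TRUE))
  toy[[dst[k]]] <- cut(v, breaks = brks, include.lowest = TRUE)
}
```

Each `_bin` column now depends only on its own source column.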
I would recommend separating out the data you want to bin into its own data.frame.
x <- dataset[, 50:60]
Then bin those columns by writing a function with the parameters you want and running it on every column with apply.
function:
# Bin a numeric vector at its deciles
mycut <- function(x) cut(x,
    breaks = unique(quantile(x, probs = seq(.1, 1, by = .1), na.rm = TRUE)),
    include.lowest = TRUE)
apply:
xbin <- apply(x, 2, mycut)
Then put xbin back into your dataset and name the new columns appropriately.
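Putting the steps together, as a sketch rather than a definitive version: `apply()` converts the data frame to a matrix and, as far as I can tell, hands back a character matrix of bin labels rather than factors, so this variant uses `lapply()`, which keeps each binned column as a factor. The toy data, the `_bin` suffix, and starting `probs` at 0 (so values below the 10th percentile get a bin instead of NA) are assumptions made for illustration, not part of the code above.

```r
set.seed(42)
# Toy stand-in for `dataset`; in the question the columns would be 50:60
dataset <- data.frame(id = 1:200, u = rnorm(200), v = runif(200), w = rexp(200))
x <- dataset[, 2:4]

# Decile bins; probs starts at 0 so the minimum is covered by the first bin
mycut <- function(col) {
  brks <- unique(quantile(col, probs = seq(0, 1, by = 0.1), na.rm = TRUE))
  cut(col, breaks = brks, include.lowest = TRUE)
}

# lapply works column by column and preserves the factor returned by cut()
xbin <- as.data.frame(lapply(x, mycut))
names(xbin) <- paste0(names(x), "_bin")

# Attach the binned columns to the original data
dataset <- cbind(dataset, xbin)
str(dataset[, names(xbin)])
```

Naming the new columns after their sources makes it easier to check later which bin column came from which variable than relying on positions such as 412:422.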