How can I cut a vector into groups containing approximately equal numbers of observations in R? I also need to know the cut point values so that I can classify future input.
Basically, I am trying to convert a continuous variable into a categorical one with an equal number of observations in each category, and I need to know the borders of each category. Please help.
For example:
bla <- c(1,2,3,4,5,6,7,8,9,10,11,12)
blaClass <- cut(bla, 3)
In this example each class in blaClass contains an equal number of observations. The problem is that my real data has many observations that are very close to each other or even identical, so it is hard to divide them into groups with equal numbers of observations.
I tried using quantileCut, but it gives me a "breaks are not unique" error.
You may use dplyr::ntile() to cut them into quantiles. For example,
library(dplyr)
ntile(bla, 3)
[1] 1 1 1 1 2 2 2 2 3 3 3 3
will split them into three equally sized groups, roughly at the quantiles q(1/3) and q(2/3).
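Note that ntile() only returns the group labels, not the cut points themselves. If you also need the borders so you can classify future input, one possible approach in base R is to compute the quantiles yourself and pass them to cut(). This is a minimal sketch, not the only way to do it: the vector x, the name n_groups and the values in new_x are made up for illustration, and unique() is used here on the assumption that duplicated quantiles are what triggers the "breaks are not unique" error with tied data.

```r
# Hypothetical data with several tied values
x <- c(1, 1, 1, 2, 2, 3, 5, 5, 8, 9, 12, 12)
n_groups <- 3

# Quantile-based cut points; unique() drops duplicated breaks,
# which can occur when many observations share the same value
probs  <- seq(0, 1, length.out = n_groups + 1)
breaks <- unique(quantile(x, probs = probs, names = FALSE))

# Classify the original data; include.lowest = TRUE keeps the minimum
# inside the first interval instead of turning it into NA
x_class <- cut(x, breaks = breaks, include.lowest = TRUE)
table(x_class)

# To classify future input, reuse the same interior cut points but open
# the outer intervals so values outside the original range still get a class
breaks_open <- breaks
breaks_open[1] <- -Inf
breaks_open[length(breaks_open)] <- Inf

new_x <- c(0.5, 4, 10)
new_class <- cut(new_x, breaks = breaks_open)
as.integer(new_class)
```

With heavily tied data the groups will only be approximately equal in size, and if unique() removes a break you end up with fewer than n_groups categories. That is the trade-off for keeping identical values in the same class.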