set.seed(1)
DATA = data.frame(X = sample(c(0:100), 1000, replace = TRUE))
DATA$CUT = with(DATA, cut(X, breaks = c(10,20,30,40,50,60,70,80,90), right = FALSE))
I want to get the groups 0-9, 10-19, 20-29, ..., 80-89, 90+,
but no matter how I call the cut function, I do not get these breaks.
You need to include the extreme bounds, because cut() returns NA for any value that falls outside the range of the breaks. For example
breaks <- c(0,10,20,30,40,50,60,70,80,90, Inf)
DATA <- transform(DATA, CUT=cut(X, breaks=breaks, right = FALSE))
which results in
table(DATA$CUT)
# [0,10) [10,20) [20,30) [30,40) [40,50) [50,60) [60,70) [70,80) [80,90) [90,Inf)
# 102 84 96 102 96 102 90 94 122 112
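To see why the outer bounds matter, here is a minimal sketch on a small made-up vector (the names x, narrow and wide are only for illustration, not from the question). It compares the breaks from the question with the extended breaks:

```r
# Hypothetical toy values covering both ends of the range
x <- c(0, 5, 10, 89, 90, 100)

# Breaks as in the question: nothing below 10, nothing from 90 upward
narrow <- cut(x, breaks = c(10, 20, 30, 40, 50, 60, 70, 80, 90), right = FALSE)

# Breaks extended with 0 and Inf so every value has an interval
wide <- cut(x, breaks = c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, Inf), right = FALSE)

is.na(narrow)  # values outside [10, 90) are NA
is.na(wide)    # no value is NA
```

With right = FALSE the last interval is [80,90), so 90 itself is also dropped by the narrow breaks unless an upper bound such as Inf is added.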
Since cut() usually expects continuous values and not counts, if you have integers, [0,10) is the same as [0,9] or 0-9.
If you want to set the labels, you can do
breaks <- c(0,10,20,30,40,50,60,70,80,90, Inf)
labels <- paste(head(breaks, -1), tail(breaks, -1)-1, sep="-")
DATA <- transform(DATA, CUT=cut(X, breaks=breaks, labels=labels, right = FALSE))
which now results in
table(DATA$CUT)
# 0-9 10-19 20-29 30-39 40-49 50-59 60-69 70-79 80-89 90-Inf
# 102 84 96 102 96 102 90 94 122 112
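If you would rather have the last group read 90+ instead of 90-Inf, as in the question, one possible variation is to build the labels conditionally. This is only a sketch; lower, upper and x are illustrative names, and x is a small made-up vector:

```r
breaks <- c(0, 10, 20, 30, 40, 50, 60, 70, 80, 90, Inf)

lower <- head(breaks, -1)      # left edge of each interval
upper <- tail(breaks, -1) - 1  # last integer inside each interval (Inf stays Inf)

# Use "lower-upper" for bounded groups and "lower+" for the open-ended one
labels <- ifelse(is.finite(upper),
                 paste(lower, upper, sep = "-"),
                 paste0(lower, "+"))

# Hypothetical toy values to check the labelling
x <- c(0, 9, 10, 89, 90, 100)
res <- as.character(cut(x, breaks = breaks, labels = labels, right = FALSE))
res
```

The same labels vector can be passed to the transform() call above in place of the original one.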