I'm writing a function that will divide my data into clusters. Each cluster should be a factor level. How can I combine neighbouring factor levels into one? In the end, I want the factor labels to show the lowest and highest number in each cluster. For example, if I have the data:
data <- c(1,2,1,1,2,4,2,3,3,2,4,3,2)
data2 <- as.factor(data)
This makes a factor with 4 levels. Let's say I want to combine the 2nd and the 3rd level. The only thing I can think of is using the cut() function:
data2 <- cut(data, breaks=c(0,1,3,4))
which gives me a factor with levels "(0,1]" "(1,3]" "(3,4]". Now I'd like to combine "(1,3]" and "(3,4]" into a single level "(1,4]". How can I do that? Is it possible using only data2? I know I could call cut() on data again, but with a lot of data the clustering might get messy.
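Just assign the same new label to both levels; levels that end up with identical labels are merged into one: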
levels(data2)[2:3] <- '(1,4]'
data2
#[1] (0,1] (1,4] (0,1] (0,1] (1,4] (1,4] (1,4] (1,4] (1,4] (1,4] (1,4] (1,4]
#[13] (1,4]
#Levels: (0,1] (1,4]
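If you don't want to type the merged label by hand, it can be built from the levels themselves. The following is only a sketch: merge_levels is a hypothetical helper name (not a base R function), and it assumes the factor came from cut() with the default "(a,b]" style labels and that idx points at neighbouring levels.

```r
# Hypothetical helper (not part of base R): merge the neighbouring levels at
# positions `idx` of a factor created by cut(). The new label is built from
# the lower bound of the first level and the upper bound of the last one.
# Assumes labels of the form "(a,b]" as produced by cut() with defaults.
merge_levels <- function(f, idx) {
  lv <- levels(f)[idx]
  lower <- sub(",.*$", "", lv[1])           # text before the comma, e.g. "(1"
  upper <- sub("^.*,", "", lv[length(lv)])  # text after the comma, e.g. "4]"
  # Giving several levels the same label makes R collapse them into one.
  levels(f)[idx] <- paste0(lower, ",", upper)
  f
}

data <- c(1,2,1,1,2,4,2,3,3,2,4,3,2)
data2 <- cut(data, breaks = c(0,1,3,4))

merged <- merge_levels(data2, 2:3)
levels(merged)
#[1] "(0,1]" "(1,4]"
```

Because it works only on the levels of data2, the original data does not have to be cut again.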