Let's say I have a cluster vector generated by any clustering method, like the following on the iris data:
data(iris)
kmeans_res <- kmeans(x = iris[,c(1:4)], centers = 3)
kmeans_res$cluster
Is there an efficient way to create a matrix with zeros and ones based on this vector?
The rows and the columns of this matrix both index the observations in the dataset, from 1 to n. An entry should be 1 if the two observations belong to the same cluster (e.g. observations 5 and 8), and 0 otherwise.
The problem could be solved with a loop, but this doesn't seem very elegant. Can you think of another solution?
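You may use outer(). It applies `==` to every pair of cluster labels, which gives an n x n logical matrix, and the unary + then converts TRUE/FALSE to 1/0.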
M <- +(outer(v, v, `==`))
M[50:60, 50:60]
# [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11]
# [1,] 1 0 0 0 0 0 0 0 0 0 0
# [2,] 0 1 1 0 1 1 1 1 1 1 1
# [3,] 0 1 1 0 1 1 1 1 1 1 1
# [4,] 0 0 0 1 0 0 0 0 0 0 0
# [5,] 0 1 1 0 1 1 1 1 1 1 1
# [6,] 0 1 1 0 1 1 1 1 1 1 1
# [7,] 0 1 1 0 1 1 1 1 1 1 1
# [8,] 0 1 1 0 1 1 1 1 1 1 1
# [9,] 0 1 1 0 1 1 1 1 1 1 1
# [10,] 0 1 1 0 1 1 1 1 1 1 1
# [11,] 0 1 1 0 1 1 1 1 1 1 1
v[50:60]
# [1] 1 2 2 3 2 2 2 2 2 2 2
Data (v must be defined before running the code above):
v <- kmeans_res$cluster
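To make the idea easier to check by eye, here is a minimal sketch on a small made-up label vector. The names v_toy, M_toy, Z and M_alt are hypothetical and only for illustration; they are not part of the iris run above. It also shows one possible alternative, assuming you prefer matrix algebra: build an indicator matrix with one column per cluster and multiply it by its transpose.

```r
# Toy cluster labels (made up for illustration, not from the iris run)
v_toy <- c(1, 2, 2, 3, 2)

# outer() evaluates `==` for every pair (v_toy[i], v_toy[j]);
# the unary + turns the logical result into 0/1 integers.
M_toy <- +(outer(v_toy, v_toy, `==`))

# Alternative sketch: n x k indicator matrix Z (one column per cluster).
# tcrossprod(Z) computes Z %*% t(Z); entry (i, j) is 1 only when
# observations i and j have their 1 in the same column of Z.
Z <- +(outer(v_toy, sort(unique(v_toy)), `==`))
M_alt <- tcrossprod(Z)

# Both constructions should agree element by element.
all(M_toy == M_alt)
```

For a plain cluster vector the outer() version is the shorter of the two; the indicator-matrix version may be handy if you already have Z for other purposes.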