I am running a cluster analysis on a large data set, slicing and dicing it into various parts with a variety of methods, all with the goal of getting the best possible segmentation from kmeans. I now have 15 separate kmeans objects as results, and it would be very helpful to turn them into a single table so I can view them all at once, including a title to identify each model and the key statistics for each. Any advice? An example of what I get when I just print one result to the console in RStudio is below. I've searched for packages and have not had any luck. Thanks for any help you can provide!
K-means clustering with 5 clusters of sizes 11356, 4621, 3380, 7455, 4381
Cluster means:
PRCT_DIFF_LOCK_ON PRCT_FRONT_PTO_ON PRCT_REAR_PTO_ON PRCT_MFWD_ON
1 0.045629787 0.0006149385 0.05848930 0.80521712
2 0.006848544 0.0036244639 0.15807745 0.06906081
3 0.390860459 0.0004615964 0.07576421 0.79353567
4 0.040412934 0.0048262841 0.11052730 0.48966547
5 0.053424999 0.0149324570 0.45581038 0.64261907
Within cluster sum of squares by cluster:
[1] 665.8571 568.6334 264.2810 554.8512 457.3876
(between_SS / total_SS = 55.5 %)
The output of kmeans is a list. If we want to extract the cluster means, use
k2$centers
Murder Assault UrbanPop Rape
1 1.004934 1.0138274 0.1975853 0.8469650
2 -0.669956 -0.6758849 -0.1317235 -0.5646433
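The other figures in the printed summary live in the same list and can be pulled out the same way as k2$centers. A minimal sketch, assuming the USArrests data used in this answer; the set.seed() call is an addition here so that repeated runs give the same cluster labels:

```r
# Self-contained sketch: fit a small model, then inspect the returned list
df <- scale(na.omit(USArrests))
set.seed(1)  # assumption: fixed seed, not part of the original example
k2 <- kmeans(df, centers = 2, nstart = 25)

names(k2)                # every component stored in the kmeans object
k2$size                  # number of observations in each cluster
k2$withinss              # within-cluster sum of squares, one value per cluster
k2$betweenss / k2$totss  # the "between_SS / total_SS" ratio from the printout
```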
The broom package can summarise the output in a data.frame/tibble:
library(broom)
> tidy(k2)
# A tibble: 2 x 7
Murder Assault UrbanPop Rape size withinss cluster
<dbl> <dbl> <dbl> <dbl> <int> <dbl> <fct>
1 1.00 1.01 0.198 0.847 20 46.7 1
2 -0.670 -0.676 -0.132 -0.565 30 56.1 2
> glance(k2)
# A tibble: 1 x 4
totss tot.withinss betweenss iter
<dbl> <dbl> <dbl> <int>
1 196 103. 93.1 1
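To get one row per model, as the question asks, the same fields that glance() reports can be collected from each object in turn. The following is a rough sketch in base R, not a definitive implementation. It assumes the fitted models are kept in a named list; the names models, summarise_kmeans, model_table and pct_between are all hypothetical, and three small fits stand in for the 15 real ones:

```r
# Stand-in models; in practice this list would hold the 15 existing kmeans objects
df <- scale(na.omit(USArrests))
set.seed(1)  # assumption: fixed seed for repeatable results
models <- list(
  k2 = kmeans(df, centers = 2, nstart = 25),
  k3 = kmeans(df, centers = 3, nstart = 25),
  k4 = kmeans(df, centers = 4, nstart = 25)
)

# Hypothetical helper: reduce one kmeans object to a one-row data.frame
summarise_kmeans <- function(fit) {
  data.frame(
    k            = nrow(fit$centers),               # number of clusters
    totss        = fit$totss,                       # total sum of squares
    tot.withinss = fit$tot.withinss,                # total within-cluster SS
    betweenss    = fit$betweenss,                   # between-cluster SS
    pct_between  = 100 * fit$betweenss / fit$totss, # between_SS / total_SS in %
    iter         = fit$iter                         # iterations used
  )
}

# Stack the rows and put the list names in front as the model title
rows <- lapply(models, summarise_kmeans)
model_table <- cbind(model = names(models), do.call(rbind, rows))
rownames(model_table) <- NULL
model_table
```

If broom and purrr are both installed, purrr::map_dfr(models, glance, .id = "model") should give a similar one-row-per-model tibble with less code; the base R version is shown only to avoid extra dependencies.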
Reproducible example:
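library(cluster)      # not required for kmeans() itself, which lives in stats
df <- USArrests       # built-in data set of US arrest rates by state
df <- na.omit(df)     # drop rows with missing values
df <- scale(df)       # standardise each column to mean 0, sd 1
k2 <- kmeans(df, centers = 2, nstart = 25)  # 2 clusters, best of 25 random starts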