Tags: r, ggplot2, k-means

Multiple k-means clusterings and plotting with facet_wrap


I would like to ask how to automate running k-means clustering several times on the same dataset. I want to fit k-means with a varying number of clusters and then plot the results using facet_wrap,

so that one can eyeball which number of clusters seems most appropriate.

I am able to do this, but the code is very manual. Can it be automated somehow?


library(tidyverse)
Y <- mtcars %>% select(hp, disp)

kme1 <- kmeans(Y, 3)
kme2 <- kmeans(Y, 4)
kme3 <- kmeans(Y, 5)
kme4 <- kmeans(Y, 6)

A <- broom::augment(kme1, Y) %>% 
  mutate(num_clust = 3)
B <- broom::augment(kme2, Y) %>% 
  mutate(num_clust = 4)
C <- broom::augment(kme3, Y) %>% 
  mutate(num_clust = 5)
D <- broom::augment(kme4, Y) %>% 
  mutate(num_clust = 6)

rbind(A, B, C, D) %>% 
  ggplot(aes(hp, disp)) + 
  geom_point(aes(color = .cluster)) + 
  stat_ellipse(aes(x=hp,y=disp,fill=factor(.cluster)),
               geom="polygon", level=0.95, alpha=0.2) + 
  facet_wrap(~num_clust)


Solution

  • You can use purrr::map and its variants (here imap) to iterate over the cluster counts:

    library(tidyverse)
    Y <- mtcars %>% select(hp, disp)
    
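    # Fit one k-means model per k; set_names() labels each fit with its k
    map(set_names(3:6), ~kmeans(Y, .x)) %>% 
      # Attach each point's cluster assignment (.cluster) to the original rows
      map(broom::augment, Y) %>% 
      # Store k (the list name, a character string) in a num_clust column
      imap(~mutate(.x, num_clust = .y)) %>% 
      bind_rows() %>% 
      ggplot(aes(hp, disp)) + 
      geom_point(aes(color = .cluster)) + 
      stat_ellipse(aes(x=hp,y=disp,fill=factor(.cluster)),
                   geom="polygon", level=0.95, alpha=0.2) + 
      # One panel per number of clusters
      facet_wrap(~num_clust)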
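
Because kmeans() starts from randomly chosen centres, the panels can differ from one run to the next. Below is a minimal sketch of a reusable variant, assuming a fixed seed and several random starts are acceptable for your data. The helper name cluster_over_k and its nstart and seed defaults are illustrative assumptions, not part of the original answer, and the fitting step uses base R instead of purrr and broom:

```r
# Hypothetical helper (not from the original answer): run kmeans() once per
# value in `ks` and stack the labelled points into one long data frame.
cluster_over_k <- function(data, ks, nstart = 25, seed = 42) {
  set.seed(seed)  # kmeans() uses random starting centres; fix them for repeatability
  runs <- lapply(ks, function(k) {
    fit <- kmeans(data, centers = k, nstart = nstart)
    cbind(data,
          .cluster  = factor(fit$cluster),  # cluster label of each row
          num_clust = k)                    # which k produced this labelling
  })
  do.call(rbind, runs)
}

Y <- mtcars[, c("hp", "disp")]
clustered <- cluster_over_k(Y, ks = 3:6)

# Draw one panel per k (only if ggplot2 is installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(clustered, aes(hp, disp, color = .cluster)) +
    geom_point() +
    facet_wrap(~num_clust)
  print(p)
}
```

Changing ks (for example ks = 2:9) is then the only edit needed to compare a different range of cluster counts. The stat_ellipse() layer from the answer above can be added to the plot in the same way.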