I am trying to calculate the optimal number of clusters in R. I am using the following code:
library(factoextra)
library(tidyverse)
library(cluster)
wsCustomer <- read.csv(url("https://archive.ics.uci.edu/ml/machine-learning-databases/00292/Wholesale customers data.csv"))
# Convert the Region and Channel columns, replacing codes with names
wsCustomer <- wsCustomer %>%
  mutate(Channel = ifelse(Channel == 1, "HoReCa", "Retail"),
         Region = case_when(Region == 1 ~ "Lisbon",
                            Region == 2 ~ "Oporto",
                            Region == 3 ~ "Others"))
head(wsCustomer)
df <- as_tibble(scale(wsCustomer[3:8]))
# compute gap statistic
set.seed(123)
gap_stat <- clusGap(df, FUN = kmeans, nstart = 25,
                    K.max = 10, B = 50)
It gives me the following warning:
Warning message: did not converge in 10 iterations
How can I get rid of this warning message?
An answer from here
It means that the partition obtained is not stable: the algorithm reached its iteration limit before it converged, so one more iteration would still change the cluster assignments.
The default value of the iter.max parameter of kmeans is 10, which is not enough in your case, so increase it. clusGap forwards any extra arguments to the clustering function, so you can pass iter.max directly:
gap_stat <- clusGap(df, FUN = kmeans, nstart = 25, K.max = 10, B = 50, iter.max = 30)
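If you want to check whether a given iter.max is enough, here is a minimal sketch. It uses random synthetic data as a stand-in for the scaled customer data (an assumption, so it runs without downloading anything), and the values centers = 4, K.max = 6 and B = 20 are picked only to keep it quick. The iter and ifault components are part of the object kmeans returns. With the default Hartigan-Wong algorithm, an ifault of 0 should mean the reported run finished without a problem, and 2 should mean the iteration limit was hit first.

```r
library(cluster)

set.seed(123)
# Synthetic stand-in for the scaled data (assumption: 100 rows, 6 numeric columns)
x <- scale(matrix(rnorm(600), ncol = 6))

# A single kmeans fit reports the iterations used and a fault indicator
# for the best of the nstart runs
fit <- kmeans(x, centers = 4, nstart = 25, iter.max = 30)
fit$iter    # number of (outer) iterations used
fit$ifault  # 0 = no problem reported

# Extra arguments (nstart, iter.max) are forwarded by clusGap() to kmeans()
gap_stat <- clusGap(x, FUNcluster = kmeans, nstart = 25, iter.max = 30,
                    K.max = 6, B = 20)
print(gap_stat)
```

If the warning still shows up on your real data, raising iter.max further (for example to 50 or 100) is the usual next step. It only makes runs that need the extra iterations take longer.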