I don't understand what nstart changes in the algorithm.
If centers = 8, that means the function will form 8 clusters. But what does nstart vary?
This is the explanation in the documentation:
centers:
Either the number of clusters or a set of initial cluster centers. If the first, a random set of rows in x are chosen as the initial centers.
nstart:
If centers is a number, how many random sets should be chosen?
Unfortunately, ?kmeans doesn't exactly explain this (in either the stats or the amap package). But one can get an idea by looking at the kmeans code.
If one uses more than one random start (nstart greater than 1) for kmeans, then the algorithm runs once per random set of initial centers and returns the partition with the smallest total within-cluster sum of squares.
(The output contains the total within-cluster sum of squares as tot.withinss.)
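
To see the effect, here is a minimal sketch comparing one random start with several. The toy data, the seed values, and the choice of centers = 3 and nstart = 25 are assumptions made for illustration, not part of the question; only kmeans and its documented arguments and output fields from the stats package are used.

```r
# Hypothetical toy data: three loose groups of 100 points each in 2-D
set.seed(42)
x <- rbind(
  matrix(rnorm(200, mean = 0, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean = 3, sd = 0.5), ncol = 2),
  matrix(rnorm(200, mean = 6, sd = 0.5), ncol = 2)
)

# A single random set of initial centers
set.seed(1)
fit_single <- kmeans(x, centers = 3, nstart = 1)

# 25 random sets of initial centers; only the best partition is returned
set.seed(1)
fit_multi <- kmeans(x, centers = 3, nstart = 25)

# Compare the total within-cluster sum of squares of the two fits
print(fit_single$tot.withinss)
print(fit_multi$tot.withinss)
```

With more starts, tot.withinss should typically be no larger than with a single start, because a single unlucky choice of initial centers can leave the algorithm in a poor local optimum. The cost is run time, which grows roughly in proportion to nstart.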