I understand k-means algorithms steps. However I'm not sure if the algorithm will always converge? Or can the observations always switch from one centroid to another?
The algorithm always converges (by-definition) but not necessarily to global optimum.
The algorithm may switch from centroid to centroid but this is a parameter of the algorithm (precision
, or delta
). This is sometimes refered as "cycling". The algorithm after a while cycles through centroids. There are two solutions (which both can be used at the same time). Precision
parameter, maximum number of iterations
parameter.
Precision
parameter, if centroids amount of change is less than a threshold delta
, stop the algorithm.
Max Num Iterations
, if algorithm reaches that number of iterations stop the algorithm.
Note that the above schemes do not spoil the convergence characteristics of the algorithm. it still will converge but not necessarily to global optimum (this is irrelevant of the scheme used, as in many optimisation algorithms).
You may be interested in the related question on stats.SE Cycling in k-means algorithm and a referenced proof of convergence