Search code examples
algorithmcluster-analysisk-means

What does it mean for the k-means algorithm to be trapped in a local minimum?


I am learning about the k-means clustering algorithm. I have read that one of the characteristics of this algorithm is that it can get trapped in a local minimum, and that a simple way to increase the chance of finding a global optimum is to restart the algorithm with different random seeds. I understand the basic concept of the algorithm, which initialises arbitrary centroids/means in the first iteration and then assigns data points to these clusters. The centroids are then updated after the points are all assigned, and points are re-assigned again. The algorithm continues to iterate until the clusters do not change anymore.

However, I am having trouble understanding exactly what is meant by a local minimum in the context of this algorithm. Any insights are appreciated.


Solution

  • The k-means algorithm is an iterative method which converges to some configuration such that the assignments of the points to the centers do not change anymore, and stays in that configuration. There are, in fact, many such “attractor” configurations, each corresponding to some value of the “fitting error” (total distance of the points to the centers).

    If you plot the fitting error as a function of the configurations (positions of the centers), this function will have many local minima and you need a lot of luck to land in the global one.

    The following graphic visualizes the problem:

    local_minimum_visualization

    The left plot shows the “fitting error” that you want to minimize. The target global minimum that you want to reach is the red point in the right plot. But because every blue point is also a local minimum, it is likely for the algorithm to get stuck in one of these points.