python r cluster-analysis k-means word2vec

How reliable is the Elbow curve in finding K in K-Means?

I was trying to use the Elbow curve to find the optimal value of K (the number of clusters) in K-Means clustering.

The clustering was done on the average Word2Vec vectors of a text column in my dataset (1467 rows). But looking at my text data, I can clearly see more than 3 groups that the data could be split into.

I read that the idea is to keep k small while still keeping the Sum of Squared Errors (SSE) low. Can somebody tell me how reliable the Elbow curve is, and whether there's something I'm missing?

I'm attaching the Elbow curve for reference. As an exploratory step, I also tried plotting it for up to 70 clusters.

Solution

The "elbow" is not even well defined, so how can it be reliable?

You can "normalize" the SSE values by the dropoff you would expect anyway from splitting the data into k clusters, and the curve becomes a bit more readable. One example is the Calinski and Harabasz (1974) variance ratio criterion, which is essentially a rescaled version of the SSE curve that makes much more sense.
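As a rough illustration, here is a minimal sketch that computes both the raw SSE and the variance ratio criterion for a range of k using scikit-learn. The variance ratio is the between-cluster dispersion divided by the within-cluster dispersion, with each term scaled by its degrees of freedom (k - 1 and n - k). The data here is synthetic (`make_blobs`), standing in for the averaged Word2Vec vectors from the question, and `scan_k` is a hypothetical helper name, not something from the original post.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import calinski_harabasz_score

# Assumption: synthetic stand-in for the real data, which would be an array
# of shape (n_documents, embedding_dim) holding the averaged Word2Vec vectors.
X, _ = make_blobs(n_samples=600, centers=5, n_features=10, random_state=0)


def scan_k(X, k_values, random_state=0):
    """Fit K-Means for each k and return {k: (sse, variance_ratio)}."""
    results = {}
    for k in k_values:
        km = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
        sse = km.inertia_  # within-cluster sum of squared errors (the elbow curve)
        ch = calinski_harabasz_score(X, km.labels_)  # variance ratio criterion
        results[k] = (sse, ch)
    return results


results = scan_k(X, range(2, 11))
for k, (sse, ch) in results.items():
    print(f"k={k:2d}  SSE={sse:12.1f}  CH={ch:10.1f}")

# Unlike SSE, which keeps dropping as k grows, the variance ratio is read
# by looking for a maximum (higher is better).
best_k = max(results, key=lambda k: results[k][1])
print("k with the highest variance ratio:", best_k)
```

On real text embeddings the peak may be much less pronounced than on well-separated synthetic blobs, so treat the result as a hint rather than a definitive answer.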