The algorithm for the K-means++ is:
D(X(i))^2/Sum(D(X(j))^2)
from j=1 to m, where D(X(i))
is the distance between the instance and the closest centroid which is selected.What is this parameter m
used in the summation of the probability?
It might have been helpful to see the original formulation, but the algorithm is quite clear: during the initialization phase, for each point not used as a centroid, calculate the distance between said point and the nearest centroid, that will be the distance D(X[i]), the pick a random point in this set of points with probability weighted with D(X[i])^2
In your formulation it seems you got m points unused.