Tags: python-3.x, machine-learning, scikit-learn, cluster-analysis, k-means

How to get the N data points nearest to a cluster's center?


After fitting the K-means algorithm, I want to get the N data points nearest to the center of each cluster (based on Euclidean distance). I am able to get the indices of the data points in a given cluster using

np.where(km.labels_ == 0)

Solution

  • You can use the transform method of the KMeans class, which calculates the distance from each data point to each cluster center.

    Then, assuming you want the N points closest to the center of the cluster at index 0, you can do:

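    cluster = 0
    N = 2
    dists = kmeans.transform(X)[:, cluster]  # distance of every point to this center
    nearest_idx = np.argsort(dists)[:N]      # indices of the N closest points (np.sort would return only the distances)
    X[nearest_idx]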
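
A fuller sketch of the same idea, restricted to the points actually assigned to each cluster, since a point can be close to one center and still belong to another cluster whose center is closer. The helper name nearest_to_centers and the toy data are hypothetical, and the sketch assumes X is the same array the model was fitted on, so that labels_ lines up with its rows.

```python
import numpy as np
from sklearn.cluster import KMeans


def nearest_to_centers(km, X, n):
    """Map each cluster id to the indices (into X) of its n members closest to the center.

    Assumes X is the data km was fitted on, so km.labels_ matches the rows of X.
    """
    dists = km.transform(X)  # shape (n_samples, n_clusters): distance to every center
    nearest = {}
    for c in range(km.n_clusters):
        members = np.where(km.labels_ == c)[0]     # indices of points assigned to cluster c
        order = np.argsort(dists[members, c])[:n]  # closest members first
        nearest[c] = members[order]
    return nearest


# Toy data (hypothetical): two well-separated groups of three points each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
              [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

result = nearest_to_centers(km, X, n=2)
for c, idx in result.items():
    print(c, idx, X[idx])  # cluster id, row indices, and the points themselves
```

If a cluster has fewer than n members, the slice simply returns all of them, so the arrays in the result may have different lengths.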