python, machine-learning, scikit-learn, centroid

How to find cluster centroid with Scikit-learn


I have a data set with (labeled) clusters. I'm trying to find the centroid of each cluster (the vector whose distance to all data points in the cluster is smallest).

I found many solutions that perform clustering first and only then find the centroids, but I haven't found one that works on clusters that already exist.

Python with scikit-learn is preferred. Thanks.


Solution

  • Straight from the docs:

    from sklearn.neighbors import NearestCentroid
    import numpy as np

    # Six 2-D points with their known cluster labels
    X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
    y = np.array([1, 1, 1, 2, 2, 2])

    # With the default Euclidean metric, fitting computes the mean of each label's points
    clf = NearestCentroid()
    clf.fit(X, y)

    print(clf.centroids_)
    # [[-2.         -1.33333333]
    #  [ 2.          1.33333333]]
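
If you would rather not fit a classifier just to read off the centroids, the same numbers can be computed with NumPy alone. This is a minimal sketch, assuming the centroid you want is the arithmetic mean of each cluster (the point that minimizes the sum of squared Euclidean distances to the cluster's members). The names `labels` and `centroids` are chosen here for illustration and do not come from the docs.

```python
import numpy as np

# Same toy data as in the answer above.
X = np.array([[-1, -1], [-2, -1], [-3, -2], [1, 1], [2, 1], [3, 2]])
y = np.array([1, 1, 1, 2, 2, 2])

# np.unique returns the distinct labels in sorted order.
labels = np.unique(y)

# One row per label: the per-feature mean of the points carrying that label.
centroids = np.array([X[y == label].mean(axis=0) for label in labels])

print(labels)
print(centroids)
```

Row `i` of `centroids` belongs to `labels[i]`. In the scikit-learn version, the rows of `clf.centroids_` are ordered the same way as `clf.classes_`, so the two results should line up when the labels are sorted.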