Tags: matlab, distance, k-means, centroid

How to calculate distances from cluster centroids using the outputs of MATLAB's kmeans() function


I have two output clusters from MATLAB's kmeans function:
[idx,C] = kmeans(X,2);

I don't know how to use idx to calculate the distance between each point in a cluster and that cluster's centroid.

I want to get a matrix containing all points whose distance to their centroid is greater than 2.

% not MATLAB code; just illustrating the concept

Example:

c1 -> {x1, x2}:  x1-c1 = 3,  x2-c1 = 2
c2 -> {y1, y2}:  y1-c2 = 4,  y2-c2 = 1

output = {y1, x1}

Solution

  • Try it this way:

    Update: the answer now uses loops.

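    % Two synthetic 2-D groups: 150 points near the origin, 150 shifted by +15
    r = randn(300,2)*5;
    r(151:end,:) = r(151:end,:) + 15;
    
    n_clusters = 2;
    
    [idx, C] = kmeans(r, n_clusters);
    
    % For each cluster, collect its points and their Euclidean distances to the centroid
    clusters = cell(n_clusters, 1);
    distances = cell(n_clusters, 1);
    for ii = 1:n_clusters
        clusters{ii} = r(idx==ii, :);
        % Subtracting the 1-by-2 row C(ii,:) relies on implicit expansion (R2016b or later)
        distances{ii} = sqrt(sum((clusters{ii}-C(ii,:)).^2,2));
    end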
    
    figure;
    subplot(1,2,1);   
    for ii = 1:n_clusters
        plot(clusters{ii}(:,1), clusters{ii}(:,2), '.');
        hold on
        plot(C(ii,1), C(ii,2), 'ko','MarkerFaceColor', 'w');
    end
    
    title('Clusters and centroids');
    
    subplot(1,2,2);
    
    for ii = 1:n_clusters
        plot(clusters{ii}(distances{ii} > 2,1), clusters{ii}(distances{ii} > 2,2), '.');
        hold on
        plot(C(ii,1), C(ii,2), 'ko','MarkerFaceColor', 'w');
    end
    title('Centroids and points with distance > 2');
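
The per-cluster loop can also be written without a loop, which may be closer to what the question asks (using idx directly). This is only a minimal sketch, not part of the original answer: it assumes Euclidean distance, reuses the same synthetic data, and adds a fixed rng seed purely so the run is repeatable. The names d and far_points are illustrative.

```matlab
% Sketch: distance from every point to its own centroid, without a loop.
rng(0);                                  % fixed seed for repeatability (added assumption)
r = randn(300,2)*5;
r(151:end,:) = r(151:end,:) + 15;

[idx, C] = kmeans(r, 2);

% C(idx,:) picks, for each row of r, the centroid of the cluster it was
% assigned to, so both operands are 300-by-2 and the subtraction is row by row.
d = sqrt(sum((r - C(idx,:)).^2, 2));     % 300-by-1 Euclidean distances

% One matrix with all points farther than 2 from their centroid,
% in the same row order as r.
far_points = r(d > 2, :);
```

Because r and C(idx,:) already have the same size, this form does not depend on implicit expansion.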
    


    To get a cell array of matrices, one per cluster, containing the points whose distance to the centroid is greater than 2, you can do:

    distant_points = cell(n_clusters,1);
    for ii = 1:n_clusters
        distant_points{ii} = clusters{ii}(distances{ii} > 2,:);
    end
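
Since the question asks for a single matrix rather than a cell array, the per-cluster results can be stacked with vertcat. The following is a self-contained sketch under the same assumptions as the answer (Euclidean distance, threshold 2, synthetic data); the rng seed and the name all_distant are additions for illustration only.

```matlab
% Sketch: stack the per-cluster "distant" points into one matrix.
rng(0);                                  % fixed seed for repeatability (added assumption)
r = randn(300,2)*5;
r(151:end,:) = r(151:end,:) + 15;

n_clusters = 2;
[idx, C] = kmeans(r, n_clusters);

distant_points = cell(n_clusters, 1);
for ii = 1:n_clusters
    pts = r(idx==ii, :);                         % points assigned to cluster ii
    d   = sqrt(sum((pts - C(ii,:)).^2, 2));      % needs R2016b+ (implicit expansion)
    distant_points{ii} = pts(d > 2, :);
end

% Concatenate the cell contents row-wise; rows end up grouped by cluster.
all_distant = vertcat(distant_points{:});
```

The rows of all_distant are grouped by cluster index rather than kept in the original order of r, so keep the cell array if you still need to know which cluster each point came from.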