Search code examples
matlabimage-processingcentroid

How to find the centroids of pixel clusters within a binary image


I've written some code in MATLAB that converts an image (of stars) into a binary image using a set threshold and then labels each cluster of pixels (stars) that is above this threshold. The labelling produces an output: e.g.

[1 1 1 0 0 0 0 0 0
 1 1 0 0 0 2 2 2 0
 0 0 0 3 3 0 2 0 0
 0 0 0 3 3 0 0 0 0]

So each cluster of 1's, 2's, 3's etc. represents a star. I used the answer provided at this link: How to find all connected components in a binary image in Matlab? to label the pixels. I also can't use the image processing toolbox. The code that I have so far is shown below.

How do I now find the centroids of each pixel cluster in the image?

clc
clear all
close all
img=imread('star.jpg');
binary_image=convert2binary(img);
imshow(binary_image);
visited = false(size(binary_image));
[rows, cols] = size(binary_image);
B = zeros(rows, cols);
ID_counter = 1;
for row = 1:rows
    for col = 1:cols
        if binary_image(row, col) == 0
            visited(row, col) = true;
        elseif visited(row, col)
            continue;
        else
            stack = [row col];

            while ~isempty(stack)
                loc = stack(1,:);
                stack(1,:) = [];

                if visited(loc(1),loc(2))
                    continue;
                end

                visited(loc(1),loc(2)) = true;
                B(loc(1),loc(2)) = ID_counter;

                [locs_y, locs_x] = meshgrid(loc(2)-1:loc(2)+1, loc(1)-1:loc(1)+1);
                locs_y = locs_y(:);
                locs_x = locs_x(:);

                out_of_bounds = locs_x < 1 | locs_x > rows | locs_y < 1 | locs_y > cols;
                locs_y(out_of_bounds) = [];
                locs_x(out_of_bounds) = [];
                is_visited = visited(sub2ind([rows cols], locs_x, locs_y));

                locs_y(is_visited) = [];
                locs_x(is_visited) = [];

                is_1 = binary_image(sub2ind([rows cols], locs_x, locs_y));
                locs_y(~is_1) = [];
                locs_x(~is_1) = [];

                stack = [stack; [locs_x locs_y]];
            end

            ID_counter = ID_counter + 1;
        end
    end
end

function [binary] = convert2binary(img)

    [x, y, z]=size(img);
    if z==3
        img=rgb2gray(img);
    end

    img=double(img);
    sum=0;
    for i=1:x
        for j=1:y
            sum=sum+img(i, j);
        end
    end


    threshold=100  % or sum/(x*y);
    binary=zeros(x,y);

    for i=1:x
        for j=1:y
            if img(i, j) >= threshold
                binary(i, j) = 1;
            else
                binary(i, j)=0;
            end
        end
    end
end

Solution

  • The centroid is the first order moment. It is computed by

    sum(x*v)/sum(v) , sum(y*v)/sum(v)
    

    For a binary image you can do this (I'm using a trivial loop, not vectorized code, so we can extend it later easily):

    img = [1 1 1 0 0 0 0 0 0
           1 1 0 0 0 2 2 2 0
           0 0 0 3 3 0 2 0 0
           0 0 0 3 3 0 0 0 0]; % Op's example data
    
    bin = img==1;              % A binary image
    
    % Algorithm
    sum_v = 0;
    sum_iv = 0;
    sum_jv = 0;
    for jj=1:size(bin,2)
       for ii=1:size(bin,1)
          sum_v = sum_v + bin(ii,jj);
          sum_iv = sum_iv + ii * bin(ii,jj);
          sum_jv = sum_jv + jj * bin(ii,jj);
       end
    end
    centroid = [sum_iv, sum_jv] / sum_v;
    

    You could of course iterate over each of the labels of the labeled image img, and apply the above code. But that is highly inefficient. Instead, we can loop through the image once and compute all centroids at the same time. We convert sum_v etc. into vectors, containing one running sum per label:

    N = max(img(:));     % number of labels
    sum_v = zeros(N,1);
    sum_iv = zeros(N,1);
    sum_jv = zeros(N,1);
    for jj=1:size(img,2)
       for ii=1:size(img,1)
          index = img(ii,jj);
          if index>0
             sum_v(index) = sum_v(index) + 1;
             sum_iv(index) = sum_iv(index) + ii;
             sum_jv(index) = sum_jv(index) + jj;
          end
       end
    end
    centroids = [sum_iv, sum_jv] ./ sum_v;