Tags: python, image-processing, numpy, iteration, labeling

Find mean position and area of labelled objects


I have a 2D labeled image (a NumPy array) in which each label represents an object. I need to find each object's center and its area. My current solution:

centers = [np.mean(np.where(label_2d == i),1) for i in range(1,num_obj+1)]

surface_area = np.array([np.sum(label_2d == i) for i in range(1,num_obj+1)])

Note that the label_2d used for the centers is not the same array as the one used for the surface area, so I can't combine the two operations. My current code is about 10-100 times too slow.

In C++ I would iterate through the image once (two nested for loops) and fill a table (an array), from which I would then calculate the centers and surface areas.

Since for loops are quite slow in Python, I have to find another solution. Any advice?


Solution

  • You could use the center_of_mass function from scipy.ndimage for the first problem and np.bincount for the second (older code imports center_of_mass from scipy.ndimage.measurements, a namespace that is now deprecated). Both run in compiled code rather than in a Python-level loop over labels, so you can expect decent speed gains.

    Example:

    >>> import numpy as np
    >>> from scipy.ndimage import center_of_mass
    >>> 
    >>> a = np.zeros((10,10), dtype=int)
    >>> # add some labels:
    ... a[3:5, 1:3] = 1
    >>> a[7:9, 0:3] = 2
    >>> a[5:6, 4:9] = 3
    >>> print(a)
    [[0 0 0 0 0 0 0 0 0 0]
     [0 0 0 0 0 0 0 0 0 0]
     [0 0 0 0 0 0 0 0 0 0]
     [0 1 1 0 0 0 0 0 0 0]
     [0 1 1 0 0 0 0 0 0 0]
     [0 0 0 0 3 3 3 3 3 0]
     [0 0 0 0 0 0 0 0 0 0]
     [2 2 2 0 0 0 0 0 0 0]
     [2 2 2 0 0 0 0 0 0 0]
     [0 0 0 0 0 0 0 0 0 0]]
    >>> 
    >>> num_obj = 3
    >>> surface_areas = np.bincount(a.flat)[1:]
    >>> centers = center_of_mass(a, labels=a, index=range(1, num_obj+1))
    >>> print(surface_areas)
    [4 6 5]
    >>> print(centers)
    [(3.5, 1.5), (7.5, 1.0), (5.0, 6.0)]
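
The "fill a table in one pass" idea from the question can also be approximated with np.bincount alone, by passing the pixel coordinates as weights. This is only a sketch: the helper name label_stats is made up for illustration, and it assumes the labels are non-negative integers with 0 as background.

```python
import numpy as np

def label_stats(labels, num_obj):
    """Return (areas, centers) for labels 1..num_obj (hypothetical helper)."""
    flat = labels.ravel()
    # Row and column index of every pixel, flattened in the same order as `flat`.
    rows, cols = np.indices(labels.shape)
    n = num_obj + 1  # bin 0 collects the background pixels
    areas = np.bincount(flat, minlength=n)[:n]
    # Weighted bincount sums the coordinates of all pixels sharing a label.
    row_sums = np.bincount(flat, weights=rows.ravel(), minlength=n)[:n]
    col_sums = np.bincount(flat, weights=cols.ravel(), minlength=n)[:n]
    # Drop the background bin; a label with no pixels yields nan for its center.
    with np.errstate(invalid="ignore", divide="ignore"):
        sums = np.column_stack((row_sums[1:], col_sums[1:]))
        centers = sums / areas[1:, None]
    return areas[1:], centers

# Same toy image as in the example above.
a = np.zeros((10, 10), dtype=int)
a[3:5, 1:3] = 1
a[7:9, 0:3] = 2
a[5:6, 4:9] = 3

areas, centers = label_stats(a, 3)
print(areas)    # areas of labels 1, 2, 3
print(centers)  # one (row, col) pair per label
```

Since the question uses different label arrays for the centers and the areas, the helper would be called once per array, keeping only the relevant return value each time.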
    

    Speed gains depend on the size of your input data, though, so I can't give a serious estimate. It would be nice if you could add that info (size of a, number of labels, and timing results for your method and for these functions) in the comments.
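
One way to collect such timings is a small timeit comparison along these lines. The test image here is invented for illustration (a smoothed random field, thresholded and labelled with scipy.ndimage.label), so the numbers it prints say nothing about the original data; swap in the real label_2d and num_obj to get meaningful results.

```python
import timeit
import numpy as np
from scipy import ndimage

# Hypothetical test data: blobs from a smoothed random field.
rng = np.random.default_rng(0)
field = ndimage.gaussian_filter(rng.random((256, 256)), sigma=4)
label_2d, num_obj = ndimage.label(field > field.mean())

def loop_version():
    # The approach from the question: one full-image pass per label.
    idx = range(1, num_obj + 1)
    centers = [np.mean(np.where(label_2d == i), 1) for i in idx]
    areas = np.array([np.sum(label_2d == i) for i in idx])
    return np.array(centers), areas

def ndimage_version():
    # center_of_mass with unit weights gives the plain mean position.
    idx = np.arange(1, num_obj + 1)
    centers = ndimage.center_of_mass(
        np.ones_like(label_2d), labels=label_2d, index=idx
    )
    areas = np.bincount(label_2d.ravel(), minlength=num_obj + 1)[1:]
    return np.array(centers), areas

# Check that both versions agree before timing them.
c_loop, a_loop = loop_version()
c_nd, a_nd = ndimage_version()
assert np.allclose(c_loop, c_nd) and np.array_equal(a_loop, a_nd)

t_loop = timeit.timeit(loop_version, number=3) / 3
t_nd = timeit.timeit(ndimage_version, number=3) / 3
print(f"{num_obj} labels: loop {t_loop:.4f} s, ndimage + bincount {t_nd:.4f} s")
```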