python ocr scikit-image

How to order the regions in regionprops by area?


I am doing some OCR with Python. To get the coordinates of the letters in an image, I take the centroid of each region (returned by regionprops from skimage.measure), and if the distance between one centroid and any of the other centroids is less than some value, I drop that region. I thought this would solve the problem of several regions nested inside one another, but I missed that if a region with a smaller area is detected first (for example, just a part of a letter), all the bigger regions (which may contain the whole letter) are ignored. Here is my code:

    from skimage.measure import regionprops

    centroids = []  # x-coordinates (columns) of the regions seen so far
    for region in regionprops(label_image):
        if len(centroids) == 0:
            # First region: nothing to compare against yet.
            centroids.append(region.centroid[1])
            # do some stuff...
        else:
            # Horizontal distance from this region to every earlier one.
            distances = [abs(c - region.centroid[1]) for c in centroids]
            if all(d >= 0.5 * region_width for d in distances):
                pass  # do some stuff...
            centroids.append(region.centroid[1])

My question is: is there a way to order the list returned by regionprops by area, and how do I do it? Alternatively, is there a better way to avoid the problem of a region inside other regions? Thanks in advance.


Solution

  • The Python built-in sorted() takes a key= argument, a function that computes the value to sort by, and a reverse= argument to sort in decreasing order. So you can change your loop to:

        for region in sorted(
                regionprops(label_image),
                key=lambda r: r.area,
                reverse=True,
        ):
    

    To check whether one region is completely contained in another, you can use r.bbox, a (min_row, min_col, max_row, max_col) tuple whose max values are exclusive, and check whether one box is inside another, or overlaps it.
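A minimal sketch of that bounding-box check, assuming the boxes follow the (min_row, min_col, max_row, max_col) layout of region.bbox with exclusive max values. The helper names bbox_contains, bboxes_overlap and keep_outermost are made up for illustration and are not part of skimage:

```python
def bbox_contains(outer, inner):
    """Return True if box `inner` lies entirely within box `outer`.

    Boxes are (min_row, min_col, max_row, max_col), max values exclusive.
    """
    return (
        outer[0] <= inner[0]
        and outer[1] <= inner[1]
        and outer[2] >= inner[2]
        and outer[3] >= inner[3]
    )


def bboxes_overlap(a, b):
    """Return True if the two boxes share at least one pixel."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def bbox_area(box):
    """Area of the bounding box in pixels (not the region's own area)."""
    return (box[2] - box[0]) * (box[3] - box[1])


def keep_outermost(bboxes):
    """Visit boxes from largest to smallest and drop any box that is
    fully contained in a box that was already kept."""
    kept = []
    for box in sorted(bboxes, key=bbox_area, reverse=True):
        if not any(bbox_contains(k, box) for k in kept):
            kept.append(box)
    return kept
```

With real data you would probably call something like keep_outermost([r.bbox for r in regionprops(label_image)]). Note that this compares bounding boxes only, so a region whose box sits inside another box is dropped even if the pixels themselves do not overlap.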

    Finally, if you have a lot of regions, I recommend you build a scipy.spatial.cKDTree with all the centroids before running your loop, as this will make it much faster to check whether a region is close to existing ones.
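One way the cKDTree suggestion might look, as a sketch rather than a drop-in replacement for the loop in the question. The function name select_regions and the min_dist parameter are assumptions for illustration, and the sketch uses full (row, col) centroids, whereas the question compares only the column coordinate:

```python
import numpy as np
from scipy.spatial import cKDTree


def select_regions(centroids, areas, min_dist):
    """Return indices of regions to keep, largest area first.

    A region is kept only if no already-kept region has its centroid
    within min_dist (inclusive) of this region's centroid.
    """
    centroids = np.asarray(centroids, dtype=float)
    areas = np.asarray(areas, dtype=float)
    tree = cKDTree(centroids)  # built once, before the loop

    # Indices sorted by decreasing area (stable for ties).
    order = np.argsort(-areas, kind="stable")
    kept = np.zeros(len(centroids), dtype=bool)

    for i in order:
        # Indices of all centroids within min_dist (includes i itself,
        # which is not marked as kept yet).
        neighbours = tree.query_ball_point(centroids[i], r=min_dist)
        if not any(kept[j] for j in neighbours):
            kept[i] = True

    return [int(i) for i in order if kept[i]]
```

With regionprops output, the inputs could be built as centroids = [r.centroid for r in regions] and areas = [r.area for r in regions]. To match the column-only comparison from the question, you could instead pass [[r.centroid[1]] for r in regions] so that each point is one-dimensional.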