python, image-processing, scikit-image, connected-components

Calculating just a specific property in regionprops python


I am using the measure.regionprops function in scikit-image to measure the properties of connected components. It computes a whole set of properties (see the regionprops documentation). However, I only need the area of each connected component. Is there a way to compute just a single property and save computation?


Solution

  • There seems to be a more direct way to do the same thing using regionprops with cache=False. I generated labels using skimage.segmentation.slic with n_segments=10000. Then:

    rps = regionprops(labels, cache=False)
    [r.area for r in rps]
    

    My understanding of the regionprops documentation is that setting cache=False means that the attributes won't be calculated until they're called. According to %%time in Jupyter notebook, running the code above took 166ms with cache=False vs 247ms with cache=True, so it seems to work.

    I tried an equivalent of the other answer and found it much slower.

    %%time
    ard = np.empty(10000, dtype=int)
    for i in range(10000):
        # Count the pixels whose label equals i
        ard[i] = np.size(np.where(labels == i)[1])
    

    That took 34.3 seconds.

    Here's a full working example comparing the two methods using the skimage astronaut sample image and labels generated by slic segmentation:

    import numpy as np
    import skimage.measure
    from skimage.segmentation import slic
    from skimage.data import astronaut
    
    img = astronaut()
    # `+ 1` is added to avoid a region with the label of `0`
    # zero is considered unlabeled so isn't counted by regionprops
    # but would be counted by the other method.
    segments = slic(img, n_segments=1000, compactness=10) + 1
    
    # This is just to make it more like the original poster's 
    # question.
    labels, num = skimage.measure.label(segments, return_num=True)
    

    Calculate areas using the OP's suggested method, with index values adjusted to avoid a zero label:

    %%time
    area = {}
    for i in range(1, num + 1):
        # Key each area by its own label so keys and labels line up
        area[i] = np.size(np.where(labels == i)[1])
    

    CPU times: user 512 ms, sys: 0 ns, total: 512 ms
    Wall time: 506 ms

    Same calculation using regionprops:

    %%time
    rps = skimage.measure.regionprops(labels, cache=False)
    area2 = [r.area for r in rps]
    

    CPU times: user 16.6 ms, sys: 0 ns, total: 16.6 ms
    Wall time: 16.2 ms

    Verify that the results are all equal element-wise:

    # dict.values() is a view in Python 3, so convert it to a list first
    np.array_equal(list(area.values()), area2)
    

    True

    So, as long as zero labels and the difference in indexing are accounted for, both methods give the same result, but regionprops without caching is faster.
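
If the area is truly the only property needed, it may also be worth trying a plain NumPy pixel count instead of regionprops. This is not one of the methods timed above, just a minimal sketch of a different technique: `np.bincount` counts how often each label value occurs in a single pass over the image. The helper name `label_areas` is hypothetical, and the sketch assumes the labels are non-negative integers with `0` meaning unlabeled, as regionprops treats them.

```python
import numpy as np

def label_areas(labels):
    """Return {label: pixel count} for every nonzero label present.

    Assumes `labels` is a non-negative integer array where 0 is background.
    """
    # counts[k] is the number of pixels whose label equals k
    counts = np.bincount(labels.ravel())
    # Skip the background (label 0) and any label values that never occur
    return {int(lab): int(n) for lab, n in enumerate(counts) if lab != 0 and n > 0}

# Tiny hand-made label image, used only for illustration
labels = np.array([[0, 1, 1],
                   [2, 2, 1],
                   [2, 0, 3]])
areas = label_areas(labels)
print(areas)
```

For a label image like the one produced by skimage.measure.label above, the values should match `[r.area for r in rps]` in label order, since both count pixels per label. I have not timed this against the regionprops version.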