
Problem using a for loop in Python to apply skimage.filters.threshold_otsu() to a list of segments in an image/array


My problem is explained below:

import numpy as np
from skimage.util.shape import view_as_blocks
from skimage.filters import threshold_otsu

I have read a grayscale image as an array "arr" whose shape is:

>>arr.shape

(10000, 15200)

I segmented that image into 25*38 blocks, where each block has 400*400 pixels, using:

>>img= view_as_blocks(arr, block_shape=(400,400))
>>img.shape
(25, 38, 400, 400)

Now, when I use threshold_otsu() to find the threshold value for each segment separately, I get these values:

print(threshold_otsu(img[11,6],16))
-14.606459

print(threshold_otsu(img[11,7],16))
-15.792943

print(threshold_otsu(img[11,11],16))
-15.547393

print(threshold_otsu(img[12,16],16))
-16.170353

But when I use a for loop to get all the threshold values at once, I get different values.

>>crdf
array([[11,  6],
       [11,  7],
       [11, 11],
       [12, 16],
       [10,  9],
       [21, 26],
       [15, 15],
       [12, 17],
       [12, 12],
       [14, 10],
       [20, 26]], dtype=int64) 

>>for i in range(0,4):
      print(threshold_otsu(img[crdf[i]],16))

-14.187654
-14.187654
-14.187654
-13.238304

Did I make a mistake in the for loop? If not, why do I get different threshold values when I process each segment individually than when I iterate over the same segments in a for loop?


Solution

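  • When you use a NumPy array to index into another NumPy array, you trigger advanced indexing, which works differently from basic indexing and thus produces a result you didn't expect. Here, img[crdf[i]] with crdf[i] equal to array([11, 6]) does not select the single block at row 11, column 6. It selects rows 11 and 6 of the block grid in full, giving an array of shape (2, 38, 400, 400), and threshold_otsu() is then computed over all of those pixels.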

    In order to use basic indexing, you need to ensure you index with comma-separated integers or standard Python tuples.

    Therefore a quick fix is to convert your indices to a tuple:

    img[tuple(crdf[i])]
    

    But it's likely better not to use a NumPy array for the list of indices. If you don't perform mathematical operations on your indices in bulk, or your list of indices is short, you gain nothing in performance by making it an array, and you may even lose some through the overhead of creating the initial array and converting it back to a tuple later.

    Try a simple list of pairs of indices, where the pairs themselves are tuples:

    crdf = [
        (11,  6),
        (11,  7),
        (11, 11),
        (12, 16),
        (10,  9),
        (21, 26),
        (15, 15),
        (12, 17),
        (12, 12),
        (14, 10),
        (20, 26)
    ]
    
    # then you can iterate and index directly, without conversion:
    
    for i in range(0, 4):
        print(threshold_otsu(img[crdf[i]], 16))
    
    # or even like this, which is usually more idiomatic and easier to read:
    
    for pair in crdf[:4]:
        print(threshold_otsu(img[pair], 16))
    

    If you received crdf as a result from some other processing and it was already a NumPy array, you can also convert it to a list of tuples in one go first:

    crdf = [tuple(pair) for pair in crdf]
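
To see the difference between the two indexing modes without loading a large image, here is a minimal sketch on a small stand-in array. The names blocks and coords and the 4 x 5 grid of 2 x 3 blocks are made up for illustration; they play the roles of img and crdf from the question. The last part shows advanced indexing used deliberately, pairing a row array with a column array, which may be handy if you want all selected blocks in one array.

```python
import numpy as np

# Hypothetical stand-in for the (25, 38, 400, 400) block view:
# a 4 x 5 grid of blocks, each 2 x 3 pixels.
blocks = np.arange(4 * 5 * 2 * 3).reshape(4, 5, 2, 3)
coords = np.array([[1, 3], [2, 0], [3, 4]])

# Advanced indexing: the array [1, 3] selects grid rows 1 and 3 in full.
rows = blocks[coords[0]]
print(rows.shape)    # (2, 5, 2, 3)

# Basic indexing: the tuple (1, 3) selects the single block at row 1, column 3.
block = blocks[tuple(coords[0])]
print(block.shape)   # (2, 3)

# Advanced indexing used on purpose: pairing a row array with a column
# array selects one block per coordinate pair.
picked = blocks[coords[:, 0], coords[:, 1]]
print(picked.shape)  # (3, 2, 3)
```

Checking .shape on the indexed result like this is a quick way to confirm that a single block is being passed to threshold_otsu() rather than a stack of whole rows.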