Search code examples
pythonnumpyopencvfeature-extractionoverlapping

How to remove overlapping blocks from numpy array?


I'm using cv2.goodFeaturesToTrack function to find feature points in an image. The end goal is to extract square blocks of certain size, with feature points being the centers of those blocks.

However, lots of the feature points are close to each other, so the blocks are overlapping, which is not what I want.

This is an example of all feature points (centers):

array([[3536., 1419.],
       [2976., 1024.],
       [3504., 1400.],
       [3574., 1505.],
       [3672., 1453.],
       [3671., 1442.],
       [3489., 1429.],
       [3108.,  737.]])

Let's say I want to find the first n blocks with a blockRadius = 400 which are not overlapping. Any ideas on how to achieve this?


Solution

  • You'll need something iterative to do that, as recurrent dropouts like this aren't vectorizable. Something like this will work, I think

    from scipy.spatial.distance import pdist, squareform
    
    c = np.array([[3536., 1419.],
           [2976., 1024.],
           [3504., 1400.],
           [3574., 1505.],
           [3672., 1453.],
           [3671., 1442.],
           [3489., 1429.],
           [3108.,  737.]])
    
    dists = squareform(pdist(c, metric = 'chebyshev'))     # distance matrix, chebyshev here since you seem to want blocks
    indices = np.arange(c.shape[0])  # indices that haven't been dropped (all to start)
    out = [0]                        # always want the first index
    while True:
        try:
            indices = indices[dists[indices[0], indices] > 400] #drop indices that are inside threshhold
            out.append(indices[0])   # add the next index that hasn't been dropped to the output
        except:
            break   # once you run out of indices, you'll get an IndexError and you're done
    print(out)
    [0, 1]
    

    let's try with a whole bunch of points:

    np.random.seed(42)
    c = np.random.rand(10000, 2) * 800
    dists = squareform(pdist(c, metric = 'chebyshev'))     # distance matrix, checbyshev here since you seem to want squares
    indices = np.arange(c.shape[0])  # indices that haven't been dropped (all to start)
    out = [0]                        # always want the first index
    while True:
        try:
            indices = indices[dists[indices[0], indices] > 400] #drop indices that are inside threshhold
            out.append(indices[0])   # add the next index that hasn't been dropped to the output
        except:
            break   # once you run out of indices, you'll get an IndexError and you're done
    print(out, pdist(c[out], metric = 'chebyshev'))
    [0, 2, 6, 17] [635.77582886 590.70015659 472.87353138 541.13920029 647.69071411
     476.84658995]
    

    So, 4 points (makes sense since 4 400x400 blocks tile a 800x800 space with 4 tiles), mostly low values (17 << 10000) and distance between kept points is always > 400