python grouping ocr similarity

Identifying similar items in Python to reduce / clarify number of CV results


After running two different CV algorithms over an image (to find multiple occurrences of a detail), both deliver a list of results like the ones below.

The first algorithm delivers tuples of (left, top, width, height):

[(816, 409, 35, 39), (817, 409, 35, 39), (818, 409, 35, 39), (815, 410, 35, 39), (816, 410, 35, 39), (817, 410, 35, 39), (818, 410, 35, 39), (819, 410, 35, 39), (816, 411, 35, 39), (817, 411, 35, 39), (818, 411, 35, 39), (816, 447, 35, 39), (817, 447, 35, 39), (818, 447, 35, 39), (815, 448, 35, 39), (816, 448, 35, 39), (817, 448, 35, 39), (818, 448, 35, 39), (816, 449, 35, 39), (817, 449, 35, 39), (818, 449, 35, 39), (856, 639, 35, 39), (857, 639, 35, 39), (858, 639, 35, 39), (855, 640, 35, 39), (856, 640, 35, 39), (857, 640, 35, 39), (858, 640, 35, 39), (859, 640, 35, 39), (856, 641, 35, 39), (857, 641, 35, 39), (858, 641, 35, 39)]

The output of the second (CV2) algorithm is a list of upper-left corner coordinates:

[(816, 409), (817, 409), (818, 409), (815, 410), (816, 410), (817, 410), (818, 410), (819, 410), (816, 411), (817, 411), (818, 411), (816, 447), (817, 447), (818, 447), (815, 448), (816, 448), (817, 448), (818, 448), (819, 448), (816, 449), (817, 449), (818, 449), (856, 639), (857, 639), (858, 639), (855, 640), (856, 640), (857, 640), (858, 640), (859, 640), (856, 641), (857, 641), (858, 641)]

But only three occurrences of the searched item exist on the screen. Looking closely, you can see that, for example, the first two entries are very similar (816 versus 817 for the left position).

The CV2 code looks like this:

import cv2
import numpy as np

# detect image in image
img_rgb = open_cv_image  # original image (loaded elsewhere)
template = cv2.imread('C:/temp/detail.png')  # searching this!
h, w = template.shape[:2]  # shape is (rows, cols, channels), so height comes first

res = cv2.matchTemplate(img_rgb, template, cv2.TM_CCOEFF_NORMED)
threshold = .8
loc = np.where(res >= threshold)  # (row indices, column indices) of every score >= threshold
a = []
for pt in zip(*loc[::-1]):  # reversed so that each pt is (x, y)
    cv2.rectangle(img_rgb, pt, (pt[0] + w, pt[1] + h), (0, 0, 255), 2)
    a.append(pt)
#cv2.imwrite('result.png', img_rgb)
print(a)

So neither approach delivers EXACT results; both return a diffuse list of similar results. My questions are:

1. How can I find out HOW MANY items were really found (by grouping the results)?

2. How can I reduce the list to one item per group (it doesn't matter which one, as they are all similar)?

Is there an easy way to group similar tuples / lists in Python and reduce them to the essential items? Or is there a simple CV mechanism for Python that gives exact matches? Any help is appreciated...

Thanks in advance! Ulrich


Solution

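  • You can build a list of every tuple whose euclidean_distance from the previous one is greater than 1 (or whatever threshold you choose) with a simple list comprehension. You will need to insert an all-zero tuple at the beginning, so that the first real entry has a predecessor to compare against. Note that euclidean_distance is a helper you have to define yourself, and that s.insert modifies the input list in place. If s is your input list, then

    s.insert(0, (0,0,0,0))  # dummy predecessor for the first real entry
    t = [s[x] for x in range(1,len(s)) if euclidean_distance(s[x],s[x-1]) > 1]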
    
    >>> t
    [(816, 409, 35, 39), (815, 410, 35, 39), (816, 411, 35, 39), (816, 447, 35, 39), (815, 448, 35, 39), (816, 449, 35, 39), (856, 639, 35, 39), (855, 640, 35, 39), (856, 641, 35, 39)]
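With a threshold of 1, the result above still holds nine entries, one per pixel row of each cluster, rather than the three items on screen. The following is a minimal sketch of how the same idea might be completed, not a definitive implementation. It assumes Python 3.8+ (for math.dist), compares only the (left, top) part of each tuple so it works for both result lists, and uses a gap of 10 pixels, a value chosen purely for illustration. The function names first_of_each_run and group_by_distance are hypothetical. The second function is a variant that does not rely on near-duplicates being adjacent in the list.

```python
from math import dist  # Euclidean distance between two points (Python 3.8+)

# Results of the first algorithm from the question: (left, top, width, height).
hits = [
    (816, 409, 35, 39), (817, 409, 35, 39), (818, 409, 35, 39), (815, 410, 35, 39),
    (816, 410, 35, 39), (817, 410, 35, 39), (818, 410, 35, 39), (819, 410, 35, 39),
    (816, 411, 35, 39), (817, 411, 35, 39), (818, 411, 35, 39), (816, 447, 35, 39),
    (817, 447, 35, 39), (818, 447, 35, 39), (815, 448, 35, 39), (816, 448, 35, 39),
    (817, 448, 35, 39), (818, 448, 35, 39), (816, 449, 35, 39), (817, 449, 35, 39),
    (818, 449, 35, 39), (856, 639, 35, 39), (857, 639, 35, 39), (858, 639, 35, 39),
    (855, 640, 35, 39), (856, 640, 35, 39), (857, 640, 35, 39), (858, 640, 35, 39),
    (859, 640, 35, 39), (856, 641, 35, 39), (857, 641, 35, 39), (858, 641, 35, 39),
]


def first_of_each_run(boxes, min_gap):
    """Keep a box only if it lies more than min_gap from its predecessor.

    Same idea as the list comprehension above, but without inserting a
    dummy tuple into the input list. Relies on similar hits being
    adjacent in the list.
    """
    kept = []
    previous = None
    for box in boxes:
        if previous is None or dist(box[:2], previous[:2]) > min_gap:
            kept.append(box)
        previous = box
    return kept


def group_by_distance(boxes, min_gap):
    """Greedy grouping that does not depend on adjacency in the list.

    Each box joins the first group whose representative (its first
    member) is within min_gap; otherwise it starts a new group.
    """
    groups = []
    for box in boxes:
        for group in groups:
            if dist(box[:2], group[0][:2]) <= min_gap:
                group.append(box)
                break
        else:  # no existing group was close enough
            groups.append([box])
    return groups


rows = first_of_each_run(hits, 1)    # threshold 1: one entry per pixel row
items = first_of_each_run(hits, 10)  # assumed gap of 10 pixels
groups = group_by_distance(hits, 10)

print(len(rows), len(items), len(groups))
print([group[0] for group in groups])  # one representative per group
```

For the list from the question, a gap of 10 yields three groups, so len(groups) answers the first question and the first member of each group answers the second. The right gap depends on the image: it should be larger than the spread within one cluster of hits and smaller than the distance between two real occurrences.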