python opencv pyautogui

How do I find an image on screen while ignoring transparent pixels?


I have a PNG of an asset, say:

[Image: a banana PNG with a transparent background]

and I'm trying to find it on my screen, which looks something like:

[Image: a game showing the aforementioned banana icon on a non-transparent background]

Normally, I would use PyAutoGUI like this:

pyautogui.locateCenterOnScreen('banana.png', grayscale=True, confidence=0.9)

but it currently isn't working. It seems that the problem might be with the transparent pixels of my banana asset, which obviously aren't matched. Is there a way to do this by ignoring the transparent pixels of the banana asset and treating them as wildcards? Or another way of accomplishing this?

So far in my search, I've found this GitHub issue describing the same problem, unresolved since 2014.

Thanks!


Solution

  • OpenCV's matchTemplate() accepts an optional mask. Read the transparent template as is (cv2.IMREAD_UNCHANGED), split it into its BGR base image and its alpha channel, and pass the alpha channel as the mask so that transparent pixels are ignored during matching. According to the 4.1.1 documentation, the mask is supported only with TM_SQDIFF and TM_CCORR_NORMED. See https://docs.opencv.org/4.1.1/df/dfb/group__imgproc__object.html#ga586ebfb0a7fb604b35a23d85391329be

    Input: [image: game screenshot]

    Template: [image: bananas template with transparent background]

    import cv2
    import numpy as np
    
    # read game image
    img = cv2.imread('game.png')
    
    # read bananas image template
    template = cv2.imread('bananas.png', cv2.IMREAD_UNCHANGED)
    hh, ww = template.shape[:2]
    
    # extract bananas base image and alpha channel and make alpha 3 channels
    base = template[:,:,0:3]
    alpha = template[:,:,3]
    alpha = cv2.merge([alpha,alpha,alpha])
    
    # do masked template matching and save correlation image
    correlation = cv2.matchTemplate(img, base, cv2.TM_CCORR_NORMED, mask=alpha)
    
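    # set threshold and get all matches
    # (masked matching can yield inf/NaN scores in some OpenCV versions; treat them as no match)
    threshold = 0.95
    correlation[~np.isfinite(correlation)] = 0
    loc = np.where(correlation >= threshold)
    
    # draw matches (np.where returns rows then columns, so reverse to get x, y)
    result = img.copy()
    for pt in zip(*loc[::-1]):
        x, y = int(pt[0]), int(pt[1])
        cv2.rectangle(result, (x, y), (x + ww, y + hh), (0, 0, 255), 1)
        print((x, y))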
    
    # save results
    cv2.imwrite('bananas_base.png', base)
    cv2.imwrite('bananas_alpha.png', alpha)
    cv2.imwrite('game_bananas_matches.jpg', result)  
    
    cv2.imshow('base',base)
    cv2.imshow('alpha',alpha)
    cv2.imshow('result',result)
    cv2.waitKey(0)
    cv2.destroyAllWindows()
    

    Result: [image: game screenshot with the matches outlined in red]