python python-3.x opencv tesseract reinforcement-learning

Tesseract: cannot read digits from pixelated font


I would like my computer to learn to play a game in a virtual machine using reinforcement learning. Unfortunately, I cannot read the score, which should provide the positive rewards. The font is rather unusual, which is probably the reason. This is my code:

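import cv2
import numpy as np
import pytesseract
from matplotlib import pyplot as plt

def show(img):
    plt.imshow(img, cmap="gray")
    plt.show()

# Load the screenshot as grayscale and crop the score region
image = cv2.imread('screenshot.png', 0)
crop_img = image[100:140, 38:280]

# Binarize, then erode with a 3x3 kernel
ret, thresh = cv2.threshold(crop_img, 127, 255, cv2.THRESH_BINARY)
kernel = np.ones((3, 3), np.uint8)
img = cv2.erode(thresh, kernel, iterations=1)

# OCR restricted to digits; --psm 10 treats the image as a single character
data = pytesseract.image_to_string(img, lang='eng', config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
show(img)
print(data)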

I managed to crop just the score from the screenshot, but Tesseract doesn't seem to recognise a single character.

the score

The number of lives, which I would like to use for negative rewards, does seem to be recognised. The lives are drawn as rather strange objects that Tesseract reads as Euro signs, so I could count the Euro signs to determine the number of lives...
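The Euro-sign idea could be sketched roughly as follows. This is only an illustration: `count_lives` is a hypothetical helper name, the sample strings stand in for real OCR output, and it assumes Tesseract keeps returning "€" for every life icon.

```python
def count_lives(ocr_text: str, life_symbol: str = "€") -> int:
    """Count how many times the life symbol appears in the OCR output."""
    return ocr_text.count(life_symbol)


# Made-up OCR strings for two consecutive frames (not real Tesseract output)
lives_before = count_lives("€€€")
lives_after = count_lives("€ €\n")

# A positive difference means lives were lost, which could feed a negative reward
lives_lost = lives_before - lives_after
print(lives_lost)
```

Whitespace and newlines in the OCR output do not matter here, since only the occurrences of the symbol are counted.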

But any tips for the score?


Solution

  • It is quite challenging to detect all the digits in a single ROI; it would be better to detect them in multiple ROIs. Below is what I tried:

    1. Resize the image to a smaller size.

    2. Blur the digits as much as possible.

       import cv2
       import numpy as np
       import pytesseract

       # roi, img, region and ROIRegion come from the surrounding code, which is not shown
       barroi = cv2.cvtColor(roi, cv2.COLOR_BGR2GRAY)

       # Downscale to 50% of the original size
       scale_percent = 50
       width = int(barroi.shape[1] * scale_percent / 100)
       height = int(barroi.shape[0] * scale_percent / 100)
       dim = (width, height)
       barroi = cv2.resize(barroi, dim, interpolation=cv2.INTER_AREA)

       # Four rounds of Gaussian + median blur to smooth the pixelated edges
       for _ in range(4):
           barroi = cv2.GaussianBlur(barroi, (5, 5), 0)
           barroi = cv2.medianBlur(barroi, 5)
       kernel = np.ones((3, 3), np.uint8)
       barroi = cv2.erode(barroi, kernel, iterations=1)

       # Otsu's method picks the threshold automatically
       (thresh, barroi) = cv2.threshold(barroi, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY)
       cv2.imwrite("testing.tif", barroi)

       text = pytesseract.image_to_string(barroi, lang='eng',
                                          config='--psm 10 --oem 3 -c tessedit_char_whitelist=0123456789')
       print(str(ROIRegion[region]) + " " + str(text))

       imageName = "Region" + str(region) + ".tif"
       cv2.imwrite(imageName, roi)

       # Draw the recognised text onto the full image and save/show it
       cv2.putText(img, "Result: " + str(text), ROIRegion[region][0],
                   cv2.FONT_HERSHEY_SIMPLEX, 0.7, (255, 0, 0), 2)
       imageName = "Result.tif"
       cv2.imwrite(imageName, img)
       cv2.namedWindow('Result')
       cv2.imshow('Result', img)
       cv2.waitKey(0)  # keep the window open until a key is pressed


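One way the "multiple ROIs" idea might be organised is sketched below. It assumes the score digits are evenly spaced (a fixed-width font) and that the number of digits is known in advance; neither is confirmed above. `split_columns` and `assemble_score` are hypothetical helper names, the column range 38..280 is taken from the crop in the question, and the digit count of 6 is only a guess for illustration.

```python
import re
from typing import List, Optional, Tuple


def split_columns(x_start: int, x_end: int, n_digits: int) -> List[Tuple[int, int]]:
    """Split the column range [x_start, x_end) into n_digits near-equal slices."""
    if n_digits <= 0:
        raise ValueError("n_digits must be positive")
    width = x_end - x_start
    # Integer boundaries; rounding remainders are spread over the slices
    bounds = [x_start + (width * i) // n_digits for i in range(n_digits + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def assemble_score(digit_texts: List[str]) -> Optional[int]:
    """Join per-digit OCR strings into one integer, dropping non-digit characters."""
    digits = "".join(re.sub(r"\D", "", text) for text in digit_texts)
    return int(digits) if digits else None


# Column ranges for an assumed 6-digit score spanning columns 38..280
rois = split_columns(38, 280, 6)
print(rois)

# Each slice could then be cropped and passed to the OCR call separately, e.g.
#   digit_img = image[100:140, x0:x1]   for (x0, x1) in rois
# The strings below are made-up stand-ins for the per-digit OCR results.
print(assemble_score(["0\n", "1", "", "7\x0c"]))
```

Note that a digit cell for which OCR returns nothing is silently dropped, which shifts the remaining digits; in practice it may be safer to treat a partially read score as unreadable and skip that frame.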