Tags: python, image, image-processing, ocr, python-tesseract

Pytesseract image_to_data not able to read the numbers in my image


I'm working on a project where I use pyautogui to take a screenshot of the timer in a video game emulator, and pytesseract to read the screenshot and determine what time I got. Here's what the image looks like when I use pyautogui to capture the region I want:

[Image: in-game timer]

Using pytesseract.image_to_string() on its own worked with images of text when I tested that it was installed properly, but with the in-game timer picture it doesn't output anything. Does this have to do with the quality of the image, some limitation of pytesseract, or something else?


Solution

  • You need to preprocess the image before performing OCR with Pytesseract. Here's a simple approach using OpenCV and Pytesseract. The idea is to obtain a processed image where the text to extract is black and the background is white. To do this, we convert the image to grayscale, apply a slight Gaussian blur, then apply Otsu's threshold to obtain a binary image. We perform text extraction with the --psm 6 configuration option, which tells Tesseract to assume a single uniform block of text. See the Tesseract documentation for the other page segmentation modes.


    Input image


    Otsu's threshold to get a binary image


    Result from Pytesseract OCR

    0’ 12”92
    

    Code

    import cv2
    import pytesseract
    
    # Windows: point pytesseract at the Tesseract executable (adjust to your install path)
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    
    # Grayscale, Gaussian blur, Otsu's threshold
    image = cv2.imread('1.png')
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    blur = cv2.GaussianBlur(gray, (3,3), 0)
    thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    
    # Perform text extraction
    data = pytesseract.image_to_string(thresh, lang='eng', config='--psm 6')
    print(data)
    
    # Show the thresholded image until a key is pressed
    cv2.imshow('thresh', thresh)
    cv2.waitKey()
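
Since the goal in the question is to determine what time was achieved, the OCR string still has to be turned into a number. Below is a minimal sketch of that step, not part of the original answer. It assumes the timer always reads as minutes, a prime mark, seconds, a double-prime mark, then two digits of hundredths, as in the result 0’ 12”92 above. The names parse_timer and TIMER_PATTERN are hypothetical, and the pattern may need adjusting if Tesseract returns different quote characters for your image.

```python
import re

# Assumed layout: minutes, a prime mark, seconds, a double-prime mark, hundredths.
# Straight and curly quote variants are accepted because OCR output may vary.
TIMER_PATTERN = re.compile(r"(\d+)\s*['’‘`]\s*(\d{1,2})\s*[\"”“]\s*(\d{2})")


def parse_timer(text):
    """Return the time in seconds, or None if no timer-like text is found."""
    match = TIMER_PATTERN.search(text)
    if match is None:
        return None
    minutes, seconds, hundredths = (int(group) for group in match.groups())
    # Sum in whole hundredths first, then convert to seconds once.
    return (minutes * 6000 + seconds * 100 + hundredths) / 100


print(parse_timer("0’ 12”92"))  # 12.92
```

In practice you would pass the string returned by pytesseract.image_to_string() to parse_timer and treat a None result as a failed read, for example by retrying with a fresh screenshot.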