
Why can't Tesseract identify text that's contained in a larger box?


I am trying to extract some really obvious text from an image that's contained in a wider box:

[image: the text inside a wider box]

However, Tesseract fails to extract the text from it. If I remove the box from the image, it works just fine:

[image: the same text without the box]

Note that when I change the font to something more common (e.g. Arial), it works fine for both images. However, I need to make it work with the current font (Impact).

Any help on how to get that to work would be hugely appreciated!

Below is my current code:

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

img = cv2.imread('without_box.png')  # https://i.sstatic.net/vrJvd.png
img_text = pytesseract.image_to_string(img)
print('without_box : ', img_text)  # returns "without_box :  TEXT"

img = cv2.imread('with_box.png')  # https://i.sstatic.net/xNEdR.png
img_text = pytesseract.image_to_string(img)
print('with_box : ', img_text)  # returns "with_box : "

Solution

  • For the presented kind of images¹, you could automatically crop the white part that holds the text and run pytesseract on the crop:

    import cv2
    import pytesseract
    
    
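    def crop_and_detect(image):
        # Binarize: pixels brighter than 128 become 255, all others 0
        thr = cv2.threshold(image, 128, 255, cv2.THRESH_BINARY)[1]
        # Bounding rectangle of all non-zero (white) pixels
        x, y, w, h = cv2.boundingRect(thr)
        # Run OCR only on the cropped white region
        return pytesseract.image_to_string(image[y:y+h, x:x+w])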
    
    
    for img_file in ['vrJvd.png', 'xNEdR.png']:
        img = cv2.imread(img_file, cv2.IMREAD_GRAYSCALE)
        print(img_file, crop_and_detect(img).replace('\f', ''))
        # vrJvd.png TEXT
        #
        # xNEdR.png TEXT
        #
    
    ----------------------------------------
    System information
    ----------------------------------------
    Platform:      Windows-10-10.0.19041-SP0
    Python:        3.9.1
    PyCharm:       2021.1.2
    OpenCV:        4.5.2
    Tesseract:     5.0.0-alpha.20201127
    ----------------------------------------
    

    ¹ If you have an image-processing question, provide a representative set of possible input images. Otherwise, you might get a proper solution for the one or two input images you provided, but find out while testing it on your actual data set that "it doesn't work", and possibly post (a lot of) follow-up questions that could've been prevented in the first place.
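To see what the crop step does without needing the original images or a Tesseract installation, here is a minimal sketch that reproduces the threshold-plus-bounding-rectangle logic with plain NumPy on a synthetic image. The helper name `bright_bounding_box` and the synthetic canvas are made up for illustration, and the sketch assumes, as the answer does, that the text sits on a bright region surrounded by a darker box.

```python
import numpy as np


def bright_bounding_box(gray, threshold=128):
    """Return (x, y, w, h) of the smallest rectangle that contains every
    pixel brighter than `threshold`, or None if no such pixel exists.

    Hypothetical helper: a NumPy stand-in for the
    cv2.threshold + cv2.boundingRect pair used in the answer.
    """
    mask = gray > threshold                    # same test as THRESH_BINARY
    rows = np.flatnonzero(mask.any(axis=1))    # row indices with bright pixels
    cols = np.flatnonzero(mask.any(axis=0))    # column indices with bright pixels
    if rows.size == 0:
        return None
    x, y = int(cols[0]), int(rows[0])
    w = int(cols[-1]) - x + 1
    h = int(rows[-1]) - y + 1
    return x, y, w, h


# Synthetic stand-in for the boxed image (an assumption, not the real file):
# a dark 60x200 canvas with a white patch at rows 10..39 and columns 50..149.
canvas = np.zeros((60, 200), dtype=np.uint8)
canvas[10:40, 50:150] = 255

x, y, w, h = bright_bounding_box(canvas)
cropped = canvas[y:y + h, x:x + w]
print((x, y, w, h), cropped.shape)
# (50, 10, 100, 30) (30, 100)
```

The cropped array is what would be handed to `pytesseract.image_to_string` in the answer's `crop_and_detect`. If the bright region in real data is not a single clean rectangle, the fixed threshold of 128 would likely need tuning.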