python image-processing ocr tesseract python-tesseract

Why can't Tesseract identify text that's contained in a larger box?

I am trying to extract some really obvious text from an image that's contained in a wider box:

However, Tesseract fails to extract the text from it. If I remove the box from the image, it works just fine:

Note that when I change the font to something more common (e.g. Arial), it works fine for both images. However, I need to make it work with the current font (Impact).

Any help on how to get that to work would be hugely appreciated!

Below is my current code:

import cv2
import pytesseract

pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'

img = cv2.imread('without_box.png') #https://i.sstatic.net/vrJvd.png
img_text = pytesseract.image_to_string(img)
print('without_box : ', img_text) #returns "without_box :  TEXT"

img = cv2.imread('with_box.png') #https://i.sstatic.net/xNEdR.png
img_text = pytesseract.image_to_string(img)
print('with_box : ', img_text) #returns "with_box : "

Solution

For the presented kind of images¹, you could automatically crop the white region that holds the text, and run pytesseract on that crop only:

import cv2
import pytesseract


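def crop_and_detect(image):
    # Binarize: pixels brighter than 128 become 255, all others become 0
    thr = cv2.threshold(image, 128, 255, cv2.THRESH_BINARY)[1]
    # Bounding rectangle of all non-zero (white) pixels, i.e. the text area
    x, y, w, h = cv2.boundingRect(thr)
    # Run OCR on the cropped white region only
    return pytesseract.image_to_string(image[y:y+h, x:x+w])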


for img_file in ['vrJvd.png', 'xNEdR.png']:
    img = cv2.imread(img_file, cv2.IMREAD_GRAYSCALE)
    print(img_file, crop_and_detect(img).replace('\f', ''))
    # vrJvd.png TEXT
    #
    # xNEdR.png TEXT
    #
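
To make the cropping step easier to follow without Tesseract or the original images, here is a minimal NumPy-only sketch of what the threshold plus bounding-rectangle combination computes. The helper name `bright_bounding_box` and the synthetic canvas are assumptions for illustration and are not part of the code above; returning zeros for an image without bright pixels is a choice made for this sketch.

```python
import numpy as np


def bright_bounding_box(gray, thresh=128):
    """Return (x, y, w, h) of the smallest rectangle holding all pixels > thresh.

    Hypothetical helper: it mimics thresholding followed by taking the
    bounding rectangle of the non-zero pixels.
    """
    mask = gray > thresh
    if not mask.any():
        return 0, 0, 0, 0  # no bright pixels at all (sketch-specific choice)
    rows = np.flatnonzero(mask.any(axis=1))  # row indices containing bright pixels
    cols = np.flatnonzero(mask.any(axis=0))  # column indices containing bright pixels
    x, y = int(cols[0]), int(rows[0])
    w, h = int(cols[-1]) - x + 1, int(rows[-1]) - y + 1
    return x, y, w, h


# Synthetic stand-in for an image "with box": dark canvas, white text area,
# and a dark block inside it playing the role of the glyphs.
canvas = np.zeros((100, 300), dtype=np.uint8)
canvas[30:70, 50:250] = 255
canvas[40:60, 100:200] = 0

x, y, w, h = bright_bounding_box(canvas)
crop = canvas[y:y + h, x:x + w]
print((x, y, w, h), crop.shape)  # (50, 30, 200, 40) (40, 200)
```

The dark "glyph" pixels inside the white area do not shrink the rectangle, since only the outermost bright rows and columns matter. This also shows the limitation: the approach assumes the text area is the only bright region in the image.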

----------------------------------------
System information
----------------------------------------
Platform:      Windows-10-10.0.19041-SP0
Python:        3.9.1
PyCharm:       2021.1.2
OpenCV:        4.5.2
Tesseract:     5.0.0-alpha.20201127
----------------------------------------

¹ If you have an image processing related question, provide a representative set of possible input images. Otherwise, you might get a proper solution for the one or two input images you provided, but while testing that solution on your actual data set, you find out "it doesn't work", and possibly post (a lot of) follow-up questions, which could've been prevented in the first place.