
Tesseract OCR, reading a low-resolution/pixelated font (esp. digits)


I am trying to use Tesseract OCR v3.2 to recognize characters on a computer screen, and it is giving me a lot of trouble with a certain low-resolution font, especially when it comes to digits. The font looks like this. I am currently putting input images through a 4x upscale with a bicubic filter in Python, which results in them looking like this. Tesseract reads the processed image as "12345B?89D".

I have tried a variety of other upscale ratios (up to 1000%), as well as other image filters like lanczos, sharpen, smooth, edge enhance, and antialias. None have produced more accurate results. Anyone have ideas on how to improve recognition of this font?
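
The preprocessing described above can be sketched roughly as follows. This is a minimal illustration, assuming the Pillow library (the filter names mentioned suggest it, but the question does not say which library was used); the helper name `upscale` and the blank stand-in image are hypothetical.

```python
from PIL import Image

def upscale(img, factor=4, resample=Image.BICUBIC):
    """Return a copy of img enlarged by an integer factor with the given filter."""
    width, height = img.size
    return img.resize((width * factor, height * factor), resample)

# Hypothetical stand-in for a small screen capture (40x8 grayscale, all white).
small = Image.new("L", (40, 8), color=255)

big = upscale(small)                            # 4x bicubic, as in the question
big_lanczos = upscale(small, 4, Image.LANCZOS)  # one of the other filters tried

print(big.size)
```

With a real capture, `small` would come from `Image.open(...)` and `big` would be the image handed to Tesseract.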


Solution

  • I just tried feeding your small and 4x-upscaled images to Tesseract 4.0.0a. The small one produces no output, even after tuning the Tesseract parameters. The upscaled one is read successfully in all three cases tested: no further processing, grayscale, and further enhanced.

    The Tesseract used is the one integrated into OpenCV 3.2.0. The code is as follows.

    import cv2
    import numpy as np
    import matplotlib.pyplot as plt
    # Jupyter/IPython notebook magic; remove this line when running as a plain script
    %matplotlib inline
    
    def show(img):
        plt.imshow(img, cmap="gray")
        plt.show()
    
    def ocr(img):
        # Tesseract mode settings:
        #   OCR Engine Mode (OEM) = 3 (default = 3)
        #   Page Segmentation Mode (PSM) = 3 (default = 3)
        # Arguments: tessdata path, language, character whitelist, OEM, PSM
        tesser = cv2.text.OCRTesseract_create('C:/Program Files/Tesseract 4.0.0/tessdata/', 'eng', '0123456789', 3, 3)
        retval = tesser.run(img, 0)  # min_confidence = 0; returns the recognized text as a string
        print('OCR Output: ' + retval)
    
    # Feed the image directly to Tesseract
    img = cv2.imread('./imagesStackoverflow/SmallDigits-x4.png')
    ocr(img)
    
    # Load image as grayscale
    img = cv2.imread('./imagesStackoverflow/SmallDigits-x4.png', 0)
    show(img)
    ocr(img)
    
    # Enhance the image (binarize, then erode) and get the same positive result
    ret, thresh = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
    kernel = np.ones((3, 3), np.uint8)
    img = cv2.erode(thresh, kernel, iterations=1)
    show(img)
    ocr(img)
    

    Input images and OCR results are here.
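
As a side note on the enhancement step in the code above: erosion takes the minimum over each pixel's neighbourhood, so on a white background it thickens the dark strokes. The following is a minimal NumPy-only sketch of that effect, not the OpenCV implementation; the helper names `threshold_binary` and `erode3x3` and the 5x5 test image are hypothetical and exist only for illustration.

```python
import numpy as np

def threshold_binary(gray, thresh=127, maxval=255):
    """Mimic cv2.THRESH_BINARY: pixels above thresh become maxval, the rest 0."""
    return np.where(gray > thresh, maxval, 0).astype(np.uint8)

def erode3x3(binary):
    """Erode with a 3x3 kernel of ones: each pixel becomes the minimum of its neighbourhood."""
    h, w = binary.shape
    # Pad with white so the image border does not introduce dark pixels.
    padded = np.pad(binary, 1, mode="constant", constant_values=255)
    windows = [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    return np.minimum.reduce(windows)

# Hypothetical 5x5 test image: white background with one dark pixel in the centre.
gray = np.full((5, 5), 255, dtype=np.uint8)
gray[2, 2] = 40

binary = threshold_binary(gray)
eroded = erode3x3(binary)
dark_before = int((binary == 0).sum())
dark_after = int((eroded == 0).sum())
print(dark_before, dark_after)  # 1 9
```

The single dark pixel grows into a 3x3 dark block, which is why thin, pixelated digit strokes come out bolder after this step.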
