opencv, ocr, tesseract

Tesseract Failing on reasonably clear image


I have been trying out Tesseract OCR in combination with OpenCV (Emgu CV, C#), and I am trying to improve its reliability. On the whole it has been good, and by applying various filters one at a time and attempting OCR after each one (original, bilateral, adaptive threshold, dilate), I have seen significant improvement.

However...

The following image is being stubborn. Despite seeming quite clear to begin with, I get no results from Tesseract (original image before filters):

[Image: the original image, before filters]

In this case I am simply after the value 2.57.


Solution

  • Instead of applying filters to the image, scaling the image helped the OCR. Below is the code I tried. Sorry, I am using Linux, so I tested with Python instead of C#.

    #!/usr/bin/env python3
    import argparse
    import sys

    import cv2
    import pytesseract

    # Read the image path from the command line.
    ap = argparse.ArgumentParser()
    ap.add_argument("-i", "--image", required=True, help="Path to the image")
    args = vars(ap.parse_args())
    img = cv2.imread(args["image"])
    if img is None:
        # cv2.imread returns None instead of raising when the file cannot be read.
        sys.exit("Could not read image: " + args["image"])

    # OCR: convert to grayscale, then shrink the image before passing it to Tesseract.
    barroi = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    scale_percent = 8  # percent of original size
    width = int(barroi.shape[1] * scale_percent / 100)
    height = int(barroi.shape[0] * scale_percent / 100)
    dim = (width, height)
    barroi = cv2.resize(barroi, dim, interpolation=cv2.INTER_AREA)

    # --psm 10 treats the image as a single character; --oem 3 selects the default engine.
    text = pytesseract.image_to_string(barroi, lang='eng', config='--psm 10 --oem 3')
    print(text)

    # Save the scaled image that was passed to Tesseract, for inspection.
    image_name = "Result.tif"
    cv2.imwrite(image_name, barroi)
    

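
The scale of 8 percent worked for this particular image, and a different image may need a different value. As a rough sketch (not part of what I tested above), one option is to try several scales in turn and stop at the first one where the OCR output contains a decimal number such as 2.57. The names `scaled_dim`, `first_number` and `run_with_tesseract` and the list of candidate scales are my own assumptions for illustration. The resize and OCR steps are passed in as functions so the loop can be checked without Tesseract installed.

```python
import re

# Matches a decimal number such as "2.57". The pattern is an assumption about
# what the expected reading looks like, not something Tesseract provides.
NUMBER = re.compile(r"\d+\.\d+")


def scaled_dim(shape, scale_percent):
    """Return (width, height) for a cv2-style (rows, cols[, channels]) shape.

    Each side is clamped to at least 1 pixel so cv2.resize never gets a zero size.
    """
    height, width = shape[:2]
    return (max(1, int(width * scale_percent / 100)),
            max(1, int(height * scale_percent / 100)))


def first_number(image, scales, resize, ocr):
    """Try each scale in order and return (scale, number_text) for the first
    OCR result that contains a decimal number, or None if no scale works."""
    for scale in scales:
        candidate = resize(image, scaled_dim(image.shape, scale))
        match = NUMBER.search(ocr(candidate))
        if match:
            return scale, match.group()
    return None


def run_with_tesseract(path, scales=(100, 50, 25, 12, 8)):
    """Wire the loop to OpenCV and pytesseract (imported lazily).

    The candidate scales are assumed values; 8 is the one used in the answer above.
    """
    import cv2
    import pytesseract

    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if gray is None:
        raise FileNotFoundError(path)
    return first_number(
        gray,
        scales,
        resize=lambda im, dim: cv2.resize(im, dim, interpolation=cv2.INTER_AREA),
        ocr=lambda im: pytesseract.image_to_string(
            im, lang="eng", config="--psm 10 --oem 3"),
    )
```

Calling `run_with_tesseract("image.png")` (with your own file name) returns the first scale that produced a number together with the matched text, or `None` if none of the scales did. I have not verified which scales work on the image in the question other than 8.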