pytesseract can't read text


I followed a tutorial on pytesseract and tried different configs, but I can't get pytesseract to read a basic stop sign image.

Here's my code:

import cv2
import pytesseract
from PIL import Image


img = cv2.imread('gamepictures/STOPSIGN.jpg')
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
print(pytesseract.image_to_string(img))

img = Image.open('/home/fook/Documents/pygame/opencv/gamepictures/STOPSIGN.jpg')
rgbimg = Image.new('RGBA', img.size)
rgbimg.paste(img)
text = pytesseract.image_to_string(rgbimg)
print(text)

def print_text():
    print(pytesseract.image_to_string(Image.open('/home/fook/Documents/pygame/opencv/gamepictures/STOPSIGN.jpg')))


print_text()

My output is three music notes. When I change the config from 1 to 11 in image_to_string, I sometimes get a @:

My image is a stop sign.


Solution

  • You can use adaptive thresholding.

    Adaptive thresholding determines the threshold for each pixel from a small region around it, so it gives better results for images with varying illumination.

    If you apply adaptive thresholding:

    [image: the stop sign after adaptive thresholding]

    If you read it with --psm 6:

    STOP)
    

    For this example the input image is larger than my screen, so I resized it. The code is:

    import cv2
    import pytesseract
    
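    img = cv2.imread("GO8nU.jpg")  # Load the image
    img = cv2.resize(img, (0, 0), fx=0.25, fy=0.25)  # Shrink to a quarter size so it fits on screen
    img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to greyscale
    # Mean adaptive threshold with blockSize=3 and C=15
    img = cv2.adaptiveThreshold(img, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 3, 15)
    txt = pytesseract.image_to_string(img, config='--psm 6')  # Treat the image as a single uniform block of text
    print(txt)
    cv2.imshow("", img)  # Show the thresholded image
    cv2.waitKey(0)  # Wait for a key press before closing the window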
    

    You need to tune the block size and C parameters manually; the values used in this example may not work for other images.
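
    One way to make that manual tuning less tedious is to loop over a grid of candidate values and print whatever Tesseract returns for each. This is only a rough sketch: the helper names `candidate_params` and `sweep` and the value ranges are my own assumptions, not part of OpenCV or pytesseract. Note that OpenCV requires the block size to be odd and greater than 1.

```python
from itertools import product


def candidate_params(block_sizes=range(3, 32, 4), c_values=range(0, 21, 5)):
    """Yield (blockSize, C) pairs, skipping block sizes OpenCV would reject."""
    for block_size, c in product(block_sizes, c_values):
        # adaptiveThreshold needs an odd block size greater than 1
        if block_size > 1 and block_size % 2 == 1:
            yield block_size, c


def sweep(image_path, psm=6):
    """Run OCR for every candidate pair and return the non-empty results."""
    # Imported here so candidate_params() can be used without OpenCV installed.
    import cv2
    import pytesseract

    gray = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    if gray is None:  # imread returns None when the file cannot be read
        raise FileNotFoundError(image_path)

    results = []
    for block_size, c in candidate_params():
        binary = cv2.adaptiveThreshold(
            gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, block_size, c
        )
        text = pytesseract.image_to_string(binary, config=f"--psm {psm}").strip()
        if text:
            results.append((block_size, c, text))
    return results
```

    Calling `sweep("GO8nU.jpg")` returns a list of `(blockSize, C, text)` tuples that you can scan for the combination that reads the sign correctly. You still have to judge the results by eye.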

    From the source:

    The blockSize determines the size of the neighbourhood area and C is a constant that is subtracted from the mean or weighted sum of the neighbourhood pixels.
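
    To make the roles of blockSize and C concrete, here is a small pure-Python sketch of the mean variant of adaptive thresholding. It is illustrative only and is not OpenCV's implementation: the function name `adaptive_mean_threshold` is hypothetical, and I clip the window at the image border for simplicity, whereas OpenCV replicates border pixels.

```python
def adaptive_mean_threshold(pixels, block_size, c, max_value=255):
    """Binarize a 2D list of grey values using a per-pixel local-mean threshold."""
    if block_size % 2 == 0 or block_size < 3:
        raise ValueError("block_size must be odd and at least 3")
    height, width = len(pixels), len(pixels[0])
    half = block_size // 2
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            # Neighbourhood of block_size x block_size, clipped at the borders
            window = [
                pixels[j][i]
                for j in range(max(0, y - half), min(height, y + half + 1))
                for i in range(max(0, x - half), min(width, x + half + 1))
            ]
            # Threshold = local mean minus the constant C
            threshold = sum(window) / len(window) - c
            row.append(max_value if pixels[y][x] > threshold else 0)
        out.append(row)
    return out


# Background brightens from left to right; the "text" pixels (30 and 90)
# are darker than their surroundings but not darker than the whole image.
row = [40, 60, 30, 100, 120, 90, 160, 180]
image = [row[:] for _ in range(3)]

result = adaptive_mean_threshold(image, block_size=3, c=15)
print(result[0])  # [255, 255, 0, 255, 255, 0, 255, 255]

# A single global threshold cannot separate the same pixels:
print([255 if p > 95 else 0 for p in row])  # [0, 0, 0, 255, 255, 0, 255, 255]
```

    In this toy image the local threshold keeps both dark pixels as text, while the global threshold also swallows the dark background on the left. A larger blockSize averages over a wider area, and a larger C pushes more pixels towards white.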