python ocr tesseract

Improve text reading from image


I am trying to read the credits from a movie. To build an MVP, I started with a single frame extracted from the movie.

I use this code:

print(pytesseract.image_to_string(cv2.imread('frames/frame_144889.jpg')))

I tried different page segmentation modes (psm), but it always returns garbled text:

one Swimmer
Decay
Nurse
Aer
a
ig
coy
Coy
cor
ag
Or
Rr
Sa
Ae
Red
cod
Reng
OED Ty
Ryan Stunt Double
UST
er ey a er
Pm
JESSICA NAPIER
ALEX MALONE
Ey
DAMIEN STROUTHOS
JESSE ROWLES
DARIUS WILLIAMS
beamed
Aya
GEORGE HOUVARDAS
Sih
ata ARS Vara
BES liv4
MIKE DUNCAN
Pe
OV TN Ia
Ale Tate
SUV (aa: ae
SU aa
AIDEN GILLETT
MARK DUNCAN.

I tried another picture with a higher resolution and got better results, but I want to be able to handle non-HD movies as well.

What can I do to improve the accuracy of the text recognition?

Regards, Quentin


Solution

  • I often get good results just by following the Tesseract guide on improving accuracy: Tesseract - Improving the quality of the output

    The important things to do are:

    • Use a white background and black characters.
    • Select a suitable Tesseract page segmentation mode (psm). In this case, use psm 6 to treat the image as a single uniform block of text.
    • Use the tessedit_char_whitelist config variable to restrict recognition to the characters you are searching for. In this case, all lowercase and uppercase letters of the English alphabet.
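    The psm and whitelist settings above are passed to pytesseract as one config string. Here is a minimal sketch of building that string, assuming pytesseract tokenizes it shell-style (as shlex.split does), so whitespace inside the whitelist value would detach it from the variable name. The helper name build_tesseract_config is hypothetical and not part of pytesseract.

```python
import shlex
import string


def build_tesseract_config(psm=6, whitelist=string.ascii_letters):
    """Hypothetical helper: build a config string for pytesseract.

    Assumes the config string is later split on whitespace, so the
    whitelist must be a single token with no spaces in it.
    """
    if any(ch.isspace() for ch in whitelist):
        raise ValueError("whitelist must not contain whitespace")
    return f"--psm {psm} -c tessedit_char_whitelist={whitelist}"


config = build_tesseract_config()
# Show how the string breaks into command-line tokens.
tokens = shlex.split(config)
print(tokens)

# A space after '=' yields an empty whitelist plus a stray token.
broken = shlex.split("-c tessedit_char_whitelist= ABC --psm 6")
print(broken)
```

    The string would then be used as pytesseract.image_to_string(image, config=config). The second print shows why a space right after the equals sign is a problem: the letters end up as a separate argument instead of the whitelist value.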

    Here is the code:

    import cv2
    import pytesseract

    # Path to the Tesseract executable (Windows install location)
    pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract'
    img = cv2.imread('a.jpg')
    grayImage = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Threshold and invert: light text on a dark background becomes black text on white
    (_, blackWhiteImage) = cv2.threshold(grayImage, 127, 255, cv2.THRESH_BINARY_INV)
    # Add a white border so the text does not touch the image edges
    blackWhiteImage = cv2.copyMakeBorder(src=blackWhiteImage, top=100, bottom=100, left=50, right=50, borderType=cv2.BORDER_CONSTANT, value=(255,255,255))
    # No space after '=': the config string is split on whitespace, so a space would detach the whitelist
    data = pytesseract.image_to_data(blackWhiteImage, config="--psm 6 -c tessedit_char_whitelist=ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz")
    # Convert back to BGR so the annotations can be drawn in color
    originalImage = cv2.cvtColor(blackWhiteImage, cv2.COLOR_GRAY2BGR)

    text = []
    for z, a in enumerate(data.splitlines()):
        if z != 0:  # skip the TSV header row
            a = a.split()
            if len(a) == 12:  # only rows with a recognized word have all 12 fields
                x, y = int(a[6]), int(a[7])  # left, top
                w, h = int(a[8]), int(a[9])  # width, height
                cv2.rectangle(originalImage, (x, y), (x + w, y + h), (0, 255, 0), 1)
                cv2.putText(originalImage, a[11], (x, y - 2), cv2.FONT_HERSHEY_DUPLEX, 0.5, (0, 0, 255), 1)
                text.append(a[11])

    print("Text result: \n", text)
    cv2.imshow('Image result', originalImage)
    cv2.waitKey(0)
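
    The loop above relies on whitespace splitting to find word rows. As a possible alternative, here is a minimal sketch that parses the tab-separated output of image_to_data by column name and drops low-confidence words, which may help filter out fragments like the ones in the question. The function name parse_tesseract_tsv, the min_conf threshold, and the sample rows (coordinates and confidences) are made up for illustration; they are not real Tesseract output.

```python
import csv
import io


def parse_tesseract_tsv(tsv_text, min_conf=0.0):
    """Return word-level rows from Tesseract TSV text as dictionaries.

    Rows without text (page, block, paragraph, line levels) are skipped,
    as are words whose confidence is below min_conf.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    words = []
    for row in reader:
        text = (row.get("text") or "").strip()
        if not text:
            continue
        conf = float(row["conf"])  # may be an int or a float string
        if conf < min_conf:
            continue
        words.append({
            "text": text,
            "left": int(row["left"]),
            "top": int(row["top"]),
            "width": int(row["width"]),
            "height": int(row["height"]),
            "conf": conf,
        })
    return words


# Illustrative sample only: invented coordinates and confidences.
SAMPLE_TSV = "\n".join([
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\tleft\ttop\twidth\theight\tconf\ttext",
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t",
    "5\t1\t1\t1\t1\t1\t50\t100\t120\t20\t96\tJESSICA",
    "5\t1\t1\t1\t1\t2\t180\t100\t110\t20\t91\tNAPIER",
    "5\t1\t1\t1\t2\t1\t50\t130\t40\t20\t12\tEy",
])

words = parse_tesseract_tsv(SAMPLE_TSV, min_conf=60)
print([w["text"] for w in words])  # prints ['JESSICA', 'NAPIER']
```

    With real data, the string returned by pytesseract.image_to_data would be passed in place of SAMPLE_TSV. A suitable threshold depends on the footage, so 60 is only a starting guess.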
    

    The resulting image shows each recognized word outlined in green, with its text drawn in red above the box.