Tags: python, ocr, python-tesseract

Pytesseract only gets part of the text from the image


I am trying to use pytesseract to detect the text in these images:

[Three images were attached here.]

Specifically, I care most about detecting the text "Commercial break in progress". I used the following code to do this:

from PIL import Image
import pytesseract
print(pytesseract.image_to_string(Image.open('/home/me/Desktop/dataset/my-dataset/Apex-Legends/loustreams_001.jpg')))

However, it returns:





nercial break in progress

I know I shouldn't expect a state-of-the-art result from one line of code, but the text is clearly visible. How can I improve this?


Solution

  • You can use image_to_data to get the "Commercial break in progress" string.

    1. Convert the image to RGB (OpenCV loads it as BGR).

    2. Get the output as a dictionary.

    3. If a detected word passes the confidence threshold, display it.

    Code:

    import cv2
    import pytesseract
    from pytesseract import Output
    
    img = cv2.imread("6cNav.jpg")
    # OpenCV loads images as BGR; convert to RGB before passing to Tesseract.
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    res = pytesseract.image_to_data(rgb, output_type=Output.DICT)
    
    words = []
    
    for i in range(len(res["text"])):
        x = res["left"][i]
        y = res["top"][i]
        w = res["width"][i]
        h = res["height"][i]
    
        txt = res["text"][i].strip()
        # float() also handles confidences reported as floats or numeric strings.
        cnf = float(res["conf"][i])
    
        if txt and cnf > 95:
            # Draw the bounding box and the recognized word on the image.
            cv2.rectangle(img,
                          (x, y),
                          (x + w, y + h),
                          (0, 0, 255), 2)
            cv2.putText(img,
                        txt,
                        (x, y - 10),
                        cv2.FONT_HERSHEY_SIMPLEX,
                        1.2, (0, 255, 255), 3)
            words.append(txt)
    
    print(" ".join(words))
    cv2.imshow("img", img)
    cv2.waitKey(0)
    

    Output:

    Commercial break in progress
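
To make the confidence filtering in step 3 easier to see in isolation, here is a minimal sketch that runs on a hand-written dictionary instead of a real image. The helper name `words_above_confidence` and the `sample` values are hypothetical; the only assumption carried over from the code above is that the dictionary has parallel `"text"` and `"conf"` lists.

```python
def words_above_confidence(data, min_conf=95.0):
    """Return the non-empty words whose confidence exceeds min_conf.

    `data` is assumed to have parallel "text" and "conf" lists, like the
    dictionary used in the answer's code.
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        try:
            score = float(conf)  # accepts ints, floats, and numeric strings
        except (TypeError, ValueError):
            continue  # skip entries without a usable confidence
        word = text.strip()
        if word and score > min_conf:
            words.append(word)
    return words


# Hypothetical sample data (not produced by Tesseract).
sample = {
    "text": ["", "", "Commercial", "break", "in", "progress", "~"],
    "conf": ["-1", "-1", "96", 96.4, "97", "96", "31"],
}

print(" ".join(words_above_confidence(sample)))
```

Collecting the words in a list and joining them at the end avoids the leading space that string concatenation would leave, and the empty-text check drops the layout rows that carry no word.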
    

    Please note that my Tesseract version is 4.1.1

    Updated Code


    For all the images above, you can apply adaptive thresholding:

    [Image: the second image after adaptive thresholding]

    (The 1st and 3rd images look similar after thresholding.)

    The result will be:

    output 1:
    Commercial loreak in progress
    
    output 2:
    Commercial break in progress
    
    output 3:
    Commercial break in progress
    

    Code:

    import cv2
    import pytesseract
    
    img1 = cv2.imread("6cNav.jpg")
    img2 = cv2.imread("IKHKa.jpg")
    img3 = cv2.imread("VTJse.jpg")
    
    img_lst = [img1, img2, img3]
    
    for i, img in enumerate(img_lst):
        gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        # Compare each pixel with the Gaussian-weighted mean of its 11x11
        # neighbourhood minus 2, and invert the binary result.
        thr = cv2.adaptiveThreshold(gry, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                    cv2.THRESH_BINARY_INV, 11, 2)
    
        # --psm 6: treat the image as a single uniform block of text.
        res = pytesseract.image_to_string(thr, config="--psm 6")
        print("output {}:".format(i + 1))
        print(res)
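
For intuition about what the adaptive threshold step does, here is a rough pure-Python sketch. It is not OpenCV's implementation: it uses a plain (unweighted) neighbourhood mean rather than the Gaussian-weighted one selected by `cv2.ADAPTIVE_THRESH_GAUSSIAN_C`, it assumes replicated borders, and the function name `adaptive_threshold_inv` and the tiny `gray` image are made up for illustration.

```python
def adaptive_threshold_inv(gray, block_size=3, c=2, max_value=255):
    """Mean-based inverted adaptive threshold on a 2D list of grayscale ints.

    A pixel becomes max_value when it is at or below the mean of its
    block_size x block_size neighbourhood minus c, and 0 otherwise.
    """
    if block_size < 3 or block_size % 2 == 0:
        raise ValueError("block_size must be an odd number >= 3")
    height, width = len(gray), len(gray[0])
    radius = block_size // 2
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    # Clamp coordinates so border pixels are replicated.
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    total += gray[yy][xx]
            local_threshold = total / (block_size * block_size) - c
            if gray[y][x] <= local_threshold:
                out[y][x] = max_value
    return out


# Hypothetical 3x5 image: bright background with one dark "text" pixel.
gray = [
    [200, 200, 200, 200, 200],
    [200, 200, 50, 200, 200],
    [200, 200, 200, 200, 200],
]

binary = adaptive_threshold_inv(gray, block_size=3, c=2)
for row in binary:
    print(row)
```

Because the threshold is computed per neighbourhood instead of once for the whole image, this kind of thresholding tends to cope better with uneven backgrounds, which may be why it helps on these images.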