Pytesseract Image to String issue


Does anyone know how I can improve these results?

enter image description here

Total Kills: 15,230,550

Kill Details: (recorded after 2019/10,/Z3]

993,151 331,129
1,330,450 33,265,533
5,031,168

This is what it returns; however, it is meant to match the posted image. I am new to Python, so are there any parameters I can add to make it read the image better?

    import cv2
    import pytesseract

    img = cv2.imread("kills.jpeg")
    text = pytesseract.image_to_string(img)
    print(text)

This is my code to read the image. Is there anything I can add to make it read better? Also, I added the 2 black boxes to cover images that might have been interfering with the reading, but I still get the same result.


Solution

  • The missing piece is the page segmentation mode (psm). Try setting it explicitly when the default does not give the desired result.

    If we look at your image, the only artifacts are the black columns. Other than that, the image looks like a binary image, which is suitable for Tesseract to recognize the characters and digits.

    Let's try reading the image with the psm set to 6:

    6 Assume a single uniform block of text.

    print(pytesseract.image_to_string(img, config="--psm 6"))
    

    The result will be:

    Total Kills: 75,230,550
    
    Kill Details: (recorded after 2019/10/23)
    993,161 331,129
    1,380,450 33,265,533
    5,031,168
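
    If psm 6 does not suit a different image, it can help to compare a few modes side by side. The following is a minimal sketch, not a definitive recipe: psm_config and ocr_with_modes are hypothetical helper names (they are not part of pytesseract), and the modes tried by default (3, 4, 6, 11) are only an assumed starting set. It assumes pytesseract and the Tesseract binary are installed.

```python
def psm_config(mode):
    """Build the config string for one page segmentation mode (0-13)."""
    if not 0 <= mode <= 13:
        raise ValueError(f"psm must be between 0 and 13, got {mode}")
    return f"--psm {mode}"


def ocr_with_modes(img, modes=(3, 4, 6, 11)):
    """Run OCR once per mode and return a dict of {mode: recognized text}."""
    # Imported here so psm_config stays usable without Tesseract installed.
    import pytesseract

    return {
        mode: pytesseract.image_to_string(img, config=psm_config(mode))
        for mode in modes
    }
```

    Calling ocr_with_modes(img) on the image loaded with cv2.imread returns one string per mode, so the outputs can be printed and compared by eye. The full list of modes is printed by running tesseract --help-psm on the command line.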
    

    Update


    A second way to solve the problem is to build a binary mask and apply OCR to the features of the mask.

    • Binary-mask

      • enter image description here
    • Features of the binary-mask

      • enter image description here

    As we can see, the result is slightly different from the input image. Now, when we apply OCR, the result will be:

    Total Kills: 75,230,550
    
    Kill Details: (recorded after 2019/10/23)
    993,161 331,129
    1,380,450 33,265,533
    5,031,168
    

    Code:

    import cv2
    import numpy as np
    import pytesseract
    
    # Load the image
    img = cv2.imread("LuKz3.jpg")
    
    # Convert to hsv
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    
    # Get the binary mask
    msk = cv2.inRange(hsv, np.array([0, 0, 0]), np.array([179, 255, 154]))
    
    # Dilate the mask, AND it with the original mask, then invert (dark pixels end up black on white)
    krn = cv2.getStructuringElement(cv2.MORPH_RECT, (5, 3))
    dlt = cv2.dilate(msk, krn, iterations=5)
    res = 255 - cv2.bitwise_and(dlt, msk)
    
    # OCR
    txt = pytesseract.image_to_string(res, config="--psm 6")
    print(txt)
    
    # Display
    cv2.imshow("res", res)
    cv2.waitKey(0)