Tags: python, opencv, tesseract

How to improve Tesseract's output


I have an image that looks like this: [original image]

And this is the processed image: [processed image]

I have tried pretty much everything. I processed the image like this:

import cv2
import pytesseract

# rectKernel is not defined in the snippet; a small rectangular
# kernel is assumed here
rectKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (3, 3))

gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # convert to gray-scale
(h, w) = gray.shape[:2]
gray = cv2.resize(gray, (w * 2, h * 2))       # up-sample 2x
thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
gray = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, rectKernel)
blur = cv2.GaussianBlur(gray, (1, 1), cv2.BORDER_DEFAULT)
text = pytesseract.image_to_string(blur, config="--oem 1 --psm 6")

But Tesseract doesn't print out anything. I am using Tesseract version 5.0.0-alpha.20201127.

How do I improve its performance? It's highly unreliable.

Edit:

The answer below did a wonderful job on the said image. But when I apply this technique to an image like this one, I get the wrong output: [second image]

[wrong output]

Why is that? They seem roughly the same.


Solution

  • The problem is that the characters are not in the center of the image.

    Sometimes Tesseract has difficulty recognizing characters or digits if they are not centered.

    Therefore my suggestion is:

      1. Center the characters
      2. Up-sample and convert to gray-scale

      1. Centering the characters:

        • [centered, padded image]

        • cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
          
        • 50 is just a padding value; you can set it to any other value.

        • The background turns blue because of the value parameter. OpenCV reads images in BGR order, so passing [255] is the same as [255, 0, 0]: the blue channel is set to 255 while green and red stay 0.

        • You can try other values. For me it doesn't matter, since I'll convert the image to gray-scale in the next step.

      2. Up-sampling and converting to gray-scale:

        • The same steps you have done: the first three lines of your code.

        • [up-sampled gray-scale image]

    Now when you read the image, Tesseract outputs:

    MEHVISH MUQADDAS
    

    Code:


    import cv2
    import pytesseract
    
    # Load the image
    img = cv2.imread("onf0D.jpg")
    
    # Center the image
    img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
    
    # Up-sample
    img = cv2.resize(img, (0, 0), fx=2, fy=2)
    
    # Convert to gray-scale
    gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
    # OCR
    txt = pytesseract.image_to_string(gry, config="--psm 6")
    print(txt)
    

    Read more: tesseract-improve-quality.

    You don't need thresholding, GaussianBlur, or morphologyEx here.

    The reasons are:

    • Simple thresholding is used to extract features from an image, but your input's features are already clearly visible.

    • You don't have to smooth the image, since there is no uneven illumination in it.

    • You don't need to do segmentation, since the background is plain white.


    Update-1

    The second image requires pre-processing. However, simple thresholding won't work on this image. You need to remove the background using a binary mask, and then you can apply OCR.

    • Result of the binary-mask:

    • [binary mask image]

    Now, if you apply OCR:

    IRUM FEROZ
    

    Code:


    import cv2
    import numpy as np
    import pytesseract
    
    # Load the image
    img = cv2.imread("jCMft.jpg")
    
    # Center the image
    img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
    
    # Up-sample
    img = cv2.resize(img, (0, 0), fx=2, fy=2)
    
    # Convert to HSV color-space
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    
    # Binary mask: keep only the dark pixels (the text) via color-range segmentation
    msk = cv2.inRange(hsv, np.array([0, 0, 0]), np.array([179, 255, 130]))
    
    # OCR
    txt = pytesseract.image_to_string(msk, config="--psm 6")
    print(txt)
    

    Q: How do I find the lower and upper bounds for the cv2.inRange method?

    A: You can use the following script.

    Q: What did you change in the second image?

    A: First, I converted the image to the HSV color-space instead of gray-scale, because I wanted to remove the background. If you experiment with adaptiveThreshold, you will see there are a lot of artifacts on the background that limit Tesseract's recognition. Then I used cv2.inRange to get a binary mask. Feeding the binary mask to Tesseract as input gave me the desired result.