Search code examples
pythonopencvpython-tesseract

Trying to read text from image using pytesseract but getting blank


I've taken a few pictures , and am using openCV to crop these images so i only have the relevant text . This is the picture i've taken (i.e the cropped photo): cropped image

I try to feed this image to the image_to_string function of pytesseract but when i print the output this is what i get

text from cropped image from code is '
♀ '

Any help as to how i can get the exact reading. Tried using

text2 = pytesseract.image_to_string(cropped_image) ,config='--psm 6') 

but this gives a garbage value


Solution

  • Tarun Chakitha is right, you'll need some pre-processing, thresholding, and morphological transformations to get reliable results. The following code produces Pac=2666. 1W

    # Obtain binary image
    img_bgr = cv2.imread("3CxLj.jpg")
    img_gray = cv2.cvtColor(img_bgr[90:200, 0:495], cv2.COLOR_BGR2GRAY)
    img_bin = cv2.adaptiveThreshold(
        img_gray, 255, cv2.ADAPTIVE_THRESH_MEAN_C, cv2.THRESH_BINARY, 21, 15
    )
    fig, axs = plt.subplots(3)
    axs[0].imshow(img_gray, cmap="gray")
    axs[1].imshow(img_bin, cmap="gray")
    
    # Merge dots into characters using erosion
    kernel = np.ones((5, 5), np.uint8)
    img_eroded = cv2.erode(img_bin, kernel, iterations=1)
    axs[2].imshow(img_eroded, cmap="gray")
    fig.show()
    
    # Obtain string using psm 8 (treat the image as a single word)
    ocr_string = pytesseract.image_to_string(img_eroded, config="--psm 8")
    print(ocr_string)
    

    image gray, binary, and erode