Tags: python, image-processing, ocr, tesseract, python-tesseract

Recognize single characters on a page with Tesseract


[Image: the input screenshot]

Tesseract returns an empty string for this image.

I am trying to make a bot for the game WoW, but I am really new to OCR. I cannot get Tesseract to read this image. I want an unordered list of the characters and, if possible, the coordinates of each square containing them. Is there any way to do this?

Thank you for your time!

Here is my code:

from PIL import Image
import cv2
from pytesseract import image_to_string

column = Image.open('photo.png')
gray = column.convert('L')
blackwhite = gray.point(lambda x: 255 if x < 200 else 0, '1')
blackwhite.save("code_bw.jpg")


print(image_to_string(cv2.imread("code_bw.jpg")))

Solution

  • You need to do some preprocessing to isolate the text characters. A simple approach is to apply Otsu's threshold to obtain a binary image, then find contours and filter them by aspect ratio and contour area. This gives the bounding box coordinates of each character square, which we draw as filled rectangles onto a mask. We bitwise-and the mask with the input image to get a cleaned image, then pass it to OCR. Here's the result:

    Detected text characters

    [Image: input with a rectangle drawn around each detected character square]

    Result

    [Image: cleaned result, with everything outside the character squares set to white]

    Result from OCR

    A
    A R
    P
    

    Code

    import cv2
    import pytesseract
    import numpy as np
    
    # Windows only: point pytesseract at the Tesseract binary (adjust or remove on other systems)
    pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
    
    # Load image, grayscale, Otsu's threshold
    image = cv2.imread('1.jpg')
    original = image.copy()
    mask = np.zeros(image.shape, dtype=np.uint8) 
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    
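    # Find contours and filter using aspect ratio and area
    cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # findContours returns 2 values in OpenCV 4.x and 3 values in OpenCV 3.x
    cnts = cnts[0] if len(cnts) == 2 else cnts[1]
    for c in cnts:
        area = cv2.contourArea(c)
        x, y, w, h = cv2.boundingRect(c)
        ar = w / float(h)
        # Keep only large, roughly square contours (the character squares)
        if area > 1000 and 0.85 < ar < 1.2:
            # Outline the square on the preview image and fill it in on the mask
            cv2.rectangle(image, (x, y), (x + w, y + h), (36, 255, 12), 2)
            cv2.rectangle(mask, (x, y), (x + w, y + h), (255, 255, 255), -1)
            # Crop of this square (not used below); x, y, w, h are its coordinates
            ROI = original[y:y+h, x:x+w]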
    
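    # Bitwise-and to isolate characters, then paint everything outside the squares white
    result = cv2.bitwise_and(original, mask)
    result[mask == 0] = 255
    
    # OCR (--psm 6: treat the image as a single uniform block of text)
    data = pytesseract.image_to_string(result, lang='eng', config='--psm 6')
    print(data)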
    
    cv2.imshow('image', image)
    cv2.imshow('thresh', thresh)
    cv2.imshow('result', result)
    cv2.waitKey()
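
The question also asks for the coordinates of each square, which the code above computes but never reports. The following is a minimal sketch of one way to pair each recognized character with its box, not part of the original answer. The helper names `filter_boxes`, `read_characters`, and `ocr_single_char` are hypothetical. Two assumptions differ from the code above: the area test uses the bounding-box area (`w * h`) rather than `cv2.contourArea`, and each square is passed to Tesseract on its own with `--psm 10` (treat the image as a single character) instead of running `--psm 6` on the whole cleaned image.

```python
def filter_boxes(boxes, min_area=1000, min_ar=0.85, max_ar=1.2):
    """Keep (x, y, w, h) boxes that are large and roughly square.

    Assumption: uses the bounding-box area w * h, not the contour area.
    """
    kept = []
    for (x, y, w, h) in boxes:
        if h == 0:
            continue  # degenerate box, aspect ratio undefined
        ar = w / float(h)
        if w * h > min_area and min_ar < ar < max_ar:
            kept.append((x, y, w, h))
    return kept


def read_characters(image, boxes, ocr_fn):
    """Run ocr_fn on each box's crop and return (text, (x, y, w, h)) pairs.

    image is expected to support NumPy-style slicing, as the arrays
    returned by cv2.imread do. Boxes where OCR finds nothing are kept
    with an empty string so every square still reports its coordinates.
    """
    results = []
    for (x, y, w, h) in boxes:
        roi = image[y:y + h, x:x + w]
        text = ocr_fn(roi).strip()
        results.append((text, (x, y, w, h)))
    return results


def ocr_single_char(roi):
    """Hypothetical OCR callback: one square in, one character (ideally) out."""
    # Imported here so the helpers above can be used without Tesseract installed
    import pytesseract
    return pytesseract.image_to_string(roi, lang='eng', config='--psm 10')
```

With the variables from the code above, this could be called as `read_characters(original, filter_boxes([cv2.boundingRect(c) for c in cnts]), ocr_single_char)`. The thresholds are the same ones used in the answer and will likely need tuning for other screenshots.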