Tags: python, image, opencv, image-processing, tesseract

How to preprocess this image for Tesseract?


I have been trying for a long time to find a way to preprocess this image for OCR. The quality is very poor, clearly below 300 ppi. So far I have tried blurring and thresholding.

Image

This is everything I have done so far. Is it possible to work with this image?

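import cv2

img = cv2.imread(img_path, 0)  # flag 0 = load as grayscale

img = cv2.GaussianBlur(img, (3, 3), 0)  # light blur to reduce noise
_, threshold = cv2.threshold(img, 65, 255, cv2.THRESH_BINARY)  # pixels above 65 become white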

Preprocessed

My goal: extract all the data from this document (first name, last name, date).

Goal example: Result


Solution

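  • import cv2

    img = cv2.imread(img_path, 0)  # flag 0 = load as grayscale

    # Region of interest containing the text (pixel coordinates)
    y = 53
    x = 230
    h = 335
    w = 380
    img = img[y:y+h, x:x+w]

    # Upscale by 1.5x so the small characters are larger for Tesseract
    img = cv2.resize(img, (0, 0), fx=1.5, fy=1.5)

    # Blur, binarize, then blur again to smooth the character edges
    img = cv2.GaussianBlur(img, (3, 3), 0)
    _, threshold = cv2.threshold(img, 65, 255, cv2.THRESH_BINARY)
    threshold = cv2.GaussianBlur(threshold, (3, 3), 0)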