Tags: ocr, python-tesseract

Pytesseract: how to improve text detection


I would like to have the entries from the following vehicle registration document automatically written to a text file.

However, the text recognition is very poor. I have tried opening the image with different configurations and have also tested different colour levels of the vehicle registration document, but none of my attempts yielded a usable result.

Does anyone have an idea how it would be possible to recognise the text properly?

This is the image I tried to OCR:

[Image: scan of the vehicle registration document]

The code I used is shown below:

import cv2
import numpy as np
import pytesseract
import matplotlib.pyplot as plt
from PIL import Image
import regex

pytesseract.pytesseract.tesseract_cmd=r'C:\Program Files\Tesseract-OCR\tesseract.exe'

img = cv2.imread("Fahrzeugscheinsplit1.jpg")

result = pytesseract.image_to_string(img)
print(result)

My output is shown here:

|
08.05.2006)'| 8566) ADVOOOO1X
ne r pear
a BORD 7 aoe \
‘BWY i
QUBB1 Repieee ay a f
TRAC |
| = say, |
is Mondeo ath }
FO! s 1
Fz.2.Pers, +b. 8 Spl. .
Kombilimousine
vo) EURO 4
«| BURO 4 ) Re !
» Diesel ES
ll 0002. WW 0d62. l2198 |

Solution

  • First, you should be familiar with the image-processing techniques recommended for Tesseract. Following the official documentation, you can apply simple thresholding.

    If you apply simple thresholding, the result will be:

    [Image: document after simple thresholding]
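    A minimal sketch of this thresholding step (reusing the filename UXvS7.jpg and the threshold value 150 from the full code further below; the output filename is just an example):

    import cv2

    # Load the scan and convert it to grayscale
    gray = cv2.cvtColor(cv2.imread("UXvS7.jpg"), cv2.COLOR_BGR2GRAY)

    # Simple global thresholding: pixels brighter than 150 become white, the rest black
    thresh = cv2.threshold(gray, 150, 255, cv2.THRESH_BINARY)[1]

    # Save the intermediate result for inspection
    cv2.imwrite("threshold_step.jpg", thresh)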

    I think we should center the image for more accurate recognition, which we can do by adding borders:

    [Image: thresholded document with added white borders]
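    A minimal sketch of the border step, using the same 50-pixel white border as the full code further below:

    import cv2

    # Grayscale version of the scan
    gray = cv2.cvtColor(cv2.imread("UXvS7.jpg"), cv2.COLOR_BGR2GRAY)

    # Add a 50-pixel white border on every side so the text is not glued to the image edges
    border = cv2.copyMakeBorder(gray, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=255)

    cv2.imwrite("border_step.jpg", border)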

    The image is now ready for text extraction. If we process it and keep only the detections with a confidence > 30 (exactly what the full code below does):

    [Image: detected text regions drawn on the processed document]

    Nearly all of the text in the input image is detected. We can also print the detected strings:

    Detected Text: 08.05.2006
    Detected Text: 8566!
    Detected Text: M1
    Detected Text: AC
    Detected Text: 8
    Detected Text: 6
    Detected Text: FORD
    Detected Text: BWY
    Detected Text: SFHAP7
    Detected Text: Mondeo
    Detected Text: FORD
    Detected Text: (D)
    Detected Text: Pz.z.Pers.bef.b.
    Detected Text: 8
    Detected Text: Spl.
    Detected Text: Kombilimousine
    Detected Text: EURO
    Detected Text: 4
    Detected Text: EURO
    Detected Text: 4
    Detected Text: Diesel
    Detected Text: 0002
    Detected Text: 0462
    Detected Text: 2198
    

    Using simple thresholding we found nearly all of the values correctly. For the missing parts you can play with the parameters, for example by decreasing the confidence level, increasing the threshold value, or using other methods such as adaptive thresholding or inRange thresholding; a sketch of those two alternatives follows below.
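    If you want to try those alternatives, here is a rough sketch; the block size, the constant C and the inRange bounds are illustrative values and not tuned for this document:

    import cv2

    gray = cv2.cvtColor(cv2.imread("UXvS7.jpg"), cv2.COLOR_BGR2GRAY)

    # Adaptive thresholding: the threshold is computed per 31x31 neighbourhood,
    # which helps when the illumination of the scan is uneven
    adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                     cv2.THRESH_BINARY, 31, 10)

    # inRange thresholding: keep only pixels inside a chosen intensity band
    # (note that this marks the dark text as white, i.e. the result is inverted)
    in_range = cv2.inRange(gray, 0, 150)

    cv2.imwrite("adaptive_step.jpg", adaptive)
    cv2.imwrite("inrange_step.jpg", in_range)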

    Code:

    import cv2
    from pytesseract import image_to_data, Output


    # Load the scan, convert it to grayscale and add a 50-pixel white border
    # so that the text is not glued to the image edges
    bgr = cv2.imread("UXvS7.jpg")
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    border = cv2.copyMakeBorder(gray, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=255)

    # Simple thresholding: pixels brighter than 150 become white, the rest black
    thresh = cv2.threshold(border, 150, 255, cv2.THRESH_BINARY)[1]

    # Word-level OCR results: text, bounding boxes and confidences
    data = image_to_data(thresh, output_type=Output.DICT)

    for i in range(len(data["text"])):
        # conf can be a float string in newer pytesseract versions, hence float() first
        confidence = int(float(data["conf"][i]))
        if confidence > 30:
            x = data["left"][i]
            y = data["top"][i]
            w = data["width"][i]
            h = data["height"][i]
            text = data["text"][i]
            print(f"Detected Text: {text}")
            cv2.rectangle(thresh, (x, y), (x + w, y + h), (0, 255, 0), 2)

    cv2.imshow("", thresh)
    cv2.waitKey(0)
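
    Since the original goal was to write the entries to a text file, you can collect the filtered strings from the data dictionary produced above and dump them afterwards; the output filename here is just an example:

    # Keep every non-empty word whose confidence exceeds 30 (same filter as above)
    detected = [data["text"][i] for i in range(len(data["text"]))
                if int(float(data["conf"][i])) > 30 and data["text"][i].strip()]

    # Write one entry per line to a text file
    with open("fahrzeugschein.txt", "w", encoding="utf-8") as f:
        f.write("\n".join(detected))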