I'm trying to perform OCR using Tesseract (version 3.04.00).
All my images follow the same pattern (digit, dot, digit, digit, i.e. a decimal with two digits of precision). I tried using the --user-patterns
option, but I can't get it to work.
What I did:
I created a file patterns.txt with \d.\d\d on the first line, then passed --user-patterns patterns.txt to Tesseract.
But I get the following error:
pytesseract.pytesseract.TesseractError: (1, "Tesseract Open Source OCR Engine v3.04.00 with Leptonica read_params_file: Can't open 1 read_params_file: Can't open user-patterns read_params_file: parameter not found: \\d.\\d\\d")
How can I specify my pattern to Tesseract? Is this even the right approach? Thanks in advance for any help or advice, I can't find much documentation on Tesseract.
EDIT: add Python code
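import cv2
import pytesseract

# Load the image as grayscale and run Tesseract in single-line mode (psm 7)
img = cv2.imread("path/to/image", cv2.IMREAD_GRAYSCALE)
text = pytesseract.image_to_string(img, config="-psm 7 --user-patterns patterns.txt")
print(text)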
Never mind, I think Tesseract was overkill for my use case.
I took a reference image of each digit from 0 to 9, and for each digit I want to predict, I picked the reference with the minimum mean squared error against it. I got 100% accuracy on my test dataset.
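For anyone wanting to try the same idea, here is a minimal sketch of that minimum-MSE matching. It is not my exact code: the names mse and classify_digit are made up for illustration, and it assumes each digit has already been cropped and resized to the same shape as the reference images. Random arrays stand in for real digit images so the snippet runs on its own.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two equally shaped grayscale images."""
    # Cast to float first so uint8 subtraction cannot wrap around.
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

def classify_digit(glyph, templates):
    """Return the label of the template with the lowest MSE against glyph.

    templates is a dict mapping a digit label to its reference image.
    (Hypothetical helper; assumes glyph has the same shape as every template.)
    """
    return min(templates, key=lambda label: mse(glyph, templates[label]))

# Synthetic stand-ins for the ten reference images (assumption: 20x12 pixels).
rng = np.random.default_rng(0)
templates = {
    d: rng.integers(0, 256, size=(20, 12)).astype(np.uint8) for d in range(10)
}

# A slightly noisy copy of the "7" template plays the role of an unknown digit.
noise = rng.integers(-10, 11, size=(20, 12))
noisy_seven = np.clip(templates[7].astype(np.int16) + noise, 0, 255).astype(np.uint8)

predicted = classify_digit(noisy_seven, templates)
print(predicted)
```

With real data, the templates would be loaded with cv2.imread(..., cv2.IMREAD_GRAYSCALE), and the decimal string would be rebuilt by classifying each digit position and inserting the dot at its fixed place. This only works well when font, size, and alignment are consistent across images, which was the case for me.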