I am working on a project that includes character recognition. I am using the IAM handwriting dataset, so all the images were taken under more or less the same conditions. I am using the word images provided by the dataset and following these steps
What I'm trying to achieve is to store the characters from a person's document in folders, one per letter of the alphabet, and maybe form a template from them later on. For this I need to know which character each segmented image contains.
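To make the storage goal concrete, here is a minimal sketch of filing one segmented character into a per-letter folder. The function name `save_character`, the `lower_`/`upper_` folder prefixes, and the sequential file naming are all hypothetical choices for illustration, not something the dataset or any library prescribes. It assumes the character image has already been encoded to PNG bytes and that its label is known.

```python
from pathlib import Path


def save_character(image_bytes: bytes, label: str, root: Path) -> Path:
    """Write one character image into root/<case>_<letter>/ and return its path."""
    if len(label) != 1 or not (label.isascii() and label.isalpha()):
        raise ValueError(f"expected a single ASCII letter, got {label!r}")

    # Prefix with the case so 'a' and 'A' get separate folders even on
    # case-insensitive filesystems (hypothetical naming scheme).
    prefix = "upper_" if label.isupper() else "lower_"
    folder = root / (prefix + label.lower())
    folder.mkdir(parents=True, exist_ok=True)

    # Number the files sequentially within each letter folder.
    index = sum(1 for _ in folder.glob("*.png"))
    target = folder / f"{index:04d}.png"
    target.write_bytes(image_bytes)
    return target
```

With OpenCV, the PNG bytes for an image array `img` could be produced with `cv2.imencode(".png", img)[1].tobytes()`. The label still has to come from a recognizer, which is the part I'm stuck on.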
Here's what I get as a result -
All the characters are properly segmented (in most cases). This is more of a Tesseract question than a Python question, but I'm using Python to write the script and calling Tesseract through the pytesseract wrapper.
I'm using OpenCV to manipulate the image. Images of these letter matrices are sent as input to Tesseract (handled by pytesseract). The input is not an issue, I assure you. Is there anything else I need to do for Tesseract to work?
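For reference, a minimal sketch of how a single-character image can be passed to Tesseract through pytesseract. The helper names `single_char_config` and `recognize_character` are hypothetical. `--psm 10` is Tesseract's page segmentation mode for treating the image as a single character, and `tessedit_char_whitelist` restricts the output alphabet (some Tesseract 4.0 builds ignore the whitelist with the LSTM engine). This only shows how the call is configured; it does not make Tesseract's stock models, which are trained on printed text, any better at handwriting.

```python
import string


def single_char_config(whitelist: str = string.ascii_letters) -> str:
    """Build a Tesseract config string for images holding one character."""
    # --psm 10: treat the image as a single character.
    # tessedit_char_whitelist: only allow the given characters in the output.
    return f"--psm 10 -c tessedit_char_whitelist={whitelist}"


def recognize_character(image) -> str:
    """Run Tesseract on one segmented character (NumPy array or PIL image)."""
    # Imported lazily so the config helper can be used without pytesseract.
    import pytesseract

    text = pytesseract.image_to_string(image, config=single_char_config())
    return text.strip()
```

Without a page segmentation mode, Tesseract assumes a full page of text, so a lone letter often comes back as an empty string; whether `--psm 10` yields a correct letter for handwriting is a separate matter.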
None of these characters are recognized.
Tesseract doesn't handle handwritten text well. Try ABBYY OCR for that, or a free alternative such as Lipi Toolkit.