Hi all, I have a problem with Tesseract OCR
for C# (tessnet2): it recognizes the characters "IVI" instead of "M". Can you help me?
tessnet2.Tesseract ocr = new tessnet2.Tesseract();
ocr.SetVariable("tessedit_char_whitelist", "ABCDEFGHIJKLMNOPQRSTUVWXYZ"); // Uppercase letters only
ocr.Init(@"C:\tresnet", "fra", false); // To use correct tessdata
List<tessnet2.Word> result = ocr.DoOCR(imgSortie, Rectangle.Empty);
string ListeLettres = "";
foreach (tessnet2.Word word in result)
    ListeLettres += word.Text;
@user2094482 Hi,
I worked on character recognition with Tesseract and C++, and I once faced the same problem: my system recognized |v| instead of M, even though the image looked clear to the naked eye. I tried several image preprocessing techniques, such as binarization and blurring, to get accurate results, but none of them was 100% accurate for me. So I tried whitelisting, and it worked.
text = readLettersFromTesseractOCR(img_bw,&error,CharacterSequence);
CharacterSequence was initialized as follows:
CharacterSequence = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789<";
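To illustrate why a whitelist rules out a result like |v|, here is a minimal sketch of a check on the OCR output. It does not call Tesseract at all: isWhitelisted and kCharacterSequence are hypothetical names introduced only for this example, and the whitelist itself would still be passed to the engine (for instance through the tessedit_char_whitelist variable shown in the question).

```cpp
#include <string>

// Whitelist from the answer above: uppercase letters, digits and '<'.
// (The name kCharacterSequence is an assumption made for this sketch.)
const std::string kCharacterSequence = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789<";

// Hypothetical helper, not part of Tesseract: returns true only if every
// character of `text` appears in `whitelist`. An empty string passes.
bool isWhitelisted(const std::string& text, const std::string& whitelist) {
    return text.find_first_not_of(whitelist) == std::string::npos;
}
```

With this whitelist, isWhitelisted("|v|", kCharacterSequence) is false because neither '|' nor lowercase 'v' is allowed, while "M" passes. Note that spaces and newlines in the recognized text would also fail the check unless they are added to the whitelist or stripped first.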
Hope this will work with your system as well.