I'm trying to use Tesseract to find text in some images, but I'm having a problem processing this image:
The text is in Portuguese, and although it clearly reads "Imagem", Tesseract only gives me "ot".
The command I'm using is tesseract tmp.jpg out --psm 7 -l por, and I have tried varying the --psm parameter with no luck.
Is there something I'm missing that can improve the recognition?
Tesseract tries to guess the font size from the black pixels in your image, so it is preferable to have black text on a white background.
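If your image has light text on a dark background (I'm assuming that is the case here), one way to try this is to invert the colors before running Tesseract. Below is a minimal sketch using Pillow; the function names and the output filename tmp_inverted.png are my own placeholders, not anything Tesseract requires.

```python
from PIL import Image, ImageOps


def invert_for_ocr(image: Image.Image) -> Image.Image:
    """Return a grayscale, color-inverted copy of the image."""
    # Convert to 8-bit grayscale first: ImageOps.invert does not accept
    # every image mode, and OCR does not need color information.
    gray = image.convert("L")
    # Light pixels become dark and vice versa, so light-on-dark text
    # ends up as dark text on a light background.
    return ImageOps.invert(gray)


def save_inverted(src_path: str, dst_path: str) -> None:
    """Load an image from src_path and write its inverted copy to dst_path."""
    with Image.open(src_path) as img:
        invert_for_ocr(img).save(dst_path)
```

After calling save_inverted("tmp.jpg", "tmp_inverted.png"), you can rerun your original command on the new file: tesseract tmp_inverted.png out --psm 7 -l por. If the result is still poor, inversion alone may not be enough for this image.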