ocr tesseract

Tesseract does not recognize clear text


I'm trying to use Tesseract to find text in some images, but I'm having a problem with this one:

Imagem

The text is in Portuguese, and although the image clearly reads Imagem, Tesseract only gives me ot.

The command I'm using is tesseract tmp.jpg out --psm 7 -l por and I have tried varying the --psm parameter with no luck.

Is there something I'm missing that can improve the recognition?


Solution

  • Tesseract tries to guess the font size from the black pixels in your image, so it works best with black text on a white background. If your image has light text on a dark background, invert it before running OCR.
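
As a rough illustration of that advice, here is a minimal Python sketch that checks whether a grayscale image is mostly dark and, if so, inverts it. It assumes the image in the question has light text on a dark background, which the original post does not state outright. The helper names (needs_inversion, invert_gray, invert_file), the threshold of 128, and the output name tmp_inverted.png are all made up for this example. invert_file relies on the Pillow library being installed; the other two helpers use only plain Python.

```python
def needs_inversion(pixels, threshold=128):
    """Heuristic (an assumption, not Tesseract's own logic): if more than
    half of the 8-bit grayscale pixels are darker than `threshold`, the
    background is probably dark and the text light."""
    flat = [v for row in pixels for v in row]
    dark = sum(1 for v in flat if v < threshold)
    return dark > len(flat) / 2


def invert_gray(pixels):
    """Invert 8-bit grayscale rows so that 0 becomes 255 and vice versa."""
    return [[255 - v for v in row] for row in pixels]


def invert_file(src, dst):
    """Write an inverted grayscale copy of `src` to `dst`.
    Requires Pillow; imported here so the helpers above work without it."""
    from PIL import Image, ImageOps

    gray = Image.open(src).convert("L")   # 8-bit grayscale
    ImageOps.invert(gray).save(dst)


if __name__ == "__main__":
    # Hypothetical file names; tmp.jpg is the input used in the question.
    invert_file("tmp.jpg", "tmp_inverted.png")
```

After that, the same command from the question can be run on the inverted file, for example tesseract tmp_inverted.png out --psm 7 -l por. Whether this is enough for a given image depends on its contrast and resolution, so treat it as a first thing to try rather than a guaranteed fix.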