I followed the FAQ to make Tesseract recognize digits, but all I get is a bunch of text in the output file, despite having only numbers in my image.
My command line looks like this:
tesseract --tessdata-dir ./ ./input.jpg ./output/output digits
Any ideas what could be happening?
As mentioned in a Tesseract GitHub issue, you can't blacklist or whitelist characters with the Tesseract 4.0 LSTM engine; instead, you need to train the LSTM model on the characters you expect in your images.
Thanks to Shreeshrii, you can try his 'experimental' digits traineddata from here
Please note that Tesseract 4.0 is still in the alpha stage. If you prefer, you can still use the 3.* versions of Tesseract, which support what you need out of the box. The Tesseract 3.04 tessdata is located here, and a library for Windows can be downloaded from here
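For the 3.* route, here is a minimal sketch of a digits-only run. It assumes a Tesseract 3.x binary on your PATH and reuses the file names from the question. The whitelist is passed inline with -c tessedit_char_whitelist, which is the same variable the stock digits config file sets, so it does not depend on Tesseract finding that config file. The variable names INPUT, OUTPUT_BASE, WHITELIST and CMD are only for illustration.

```shell
# File names taken from the question; adjust to your setup.
INPUT=./input.jpg
OUTPUT_BASE=./output/output

# Characters the legacy (3.x) engine is allowed to emit.
WHITELIST=0123456789

# Build the command line. This relies on word splitting, so it assumes
# none of the paths contain spaces.
CMD="tesseract $INPUT $OUTPUT_BASE -c tessedit_char_whitelist=$WHITELIST"
echo "$CMD"

# Only run if tesseract is installed and the input image exists.
if command -v tesseract >/dev/null 2>&1 && [ -f "$INPUT" ]; then
  mkdir -p "$(dirname "$OUTPUT_BASE")"
  $CMD
fi
```

If the output file still contains letters, check with tesseract --version that the 3.x binary is the one actually being run.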