
Python recognize digits in simple image with pytesseract


I'm trying to use pytesseract to recognize digits in images like the following:

img

I tried the following code:

import pytesseract

# img is the image shown above, already loaded
text = pytesseract.image_to_string(img, lang='eng',
                config='--psm 13 --oem 3 -c tessedit_char_whitelist=0123456789')
print(text)

It gives me:

"ae"

I also tried --oem 1, with the same result.

For reference, my Tesseract version is:

pytesseract.get_tesseract_version()

LooseVersion ('4.0.0-beta.1')

Any help would be appreciated, including alternative libraries.


Solution

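  • This is a known issue: "Blacklist and whitelist unsupported with LSTM (4.0)".

    In short, tessedit_char_whitelist and tessedit_char_blacklist do not work with the LSTM engine in Tesseract 4.0, so the whitelist in your config has no effect.

    One comment on that issue states:

    ghost commented on Jul 20, 2018

    Use --oem 0 or -oem 0 and it works

    (--oem 0 selects the legacy, non-LSTM engine.) I have no way to test this at the moment, but it is worth a try.

    Version 4.1 should have this fixed.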
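
The suggestion above can be sketched roughly as follows. This is an untested sketch, not a confirmed fix: build_config, keep_digits and ocr_digits are hypothetical helper names introduced here for illustration. It assumes the installed traineddata includes the legacy model, which --oem 0 needs, and it adds a plain regex filter as a fallback for cases where the whitelist is still ignored. That filter is a substitute technique and not part of the linked issue.

```python
import re


def build_config(psm=13, oem=0, whitelist="0123456789"):
    """Assemble the Tesseract config string from the question, defaulting to --oem 0."""
    return f"--psm {psm} --oem {oem} -c tessedit_char_whitelist={whitelist}"


def keep_digits(text):
    """Fallback filter: drop every character that is not 0-9 from the OCR output."""
    return re.sub(r"[^0-9]", "", text)


def ocr_digits(img, oem=0):
    """Run pytesseract on img (e.g. a PIL image) and return only the digits found."""
    # Imported here so the two helpers above can be used without pytesseract installed.
    import pytesseract

    raw = pytesseract.image_to_string(img, lang="eng", config=build_config(oem=oem))
    return keep_digits(raw)
```

Calling ocr_digits(img) with the image from the question would be the intended usage. If the engine still returns something like "ae", the filter yields an empty string, which suggests the problem lies in recognition itself (image size, contrast, page segmentation mode) and not only in the whitelist.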