I'm trying to get Tesseract to recognize only numbers, but no matter what I pass as the configuration, it is ignored. pytesseract is at version 0.2.0 and Tesseract at 4.00.00alpha.
from PIL import Image
import pytesseract as tes
import glob

# Point pytesseract at the Tesseract executable.
tes.pytesseract.tesseract_cmd = 'C:/Program Files (x86)/Tesseract-OCR/tesseract'

a = glob.glob(r'C:\Users\Pascal\Desktop\visible\*.png')
for imgPath in a:
    # Convert to grayscale, then binarize at a threshold of 200.
    casd = Image.open(imgPath).convert('L').point(lambda x: 0 if x < 200 else 255, '1')
    im = tes.image_to_string(casd, config='outputbase digits')
    print(im)
Some outputs:
® a 69 ® 0
® a 69 ® 0
® ase ® 0
® aso ® 0
The feature that the digits config file relies on is broken in Tesseract 4.0x.
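
Until that is fixed, one workaround is to filter the OCR output yourself instead of relying on the config. This is only a minimal sketch: `keep_digits` is a hypothetical helper name, not part of pytesseract, and it simply drops everything that is not a digit, so it cannot correct digits that were misread as letters.

```python
import re


def keep_digits(ocr_text):
    """Return only the runs of digits found in OCR output, joined by spaces.

    Hypothetical helper: it post-filters the text and does not change
    what Tesseract itself recognizes.
    """
    return " ".join(re.findall(r"\d+", ocr_text))


# Example with one of the outputs from the question:
print(keep_digits("® a 69 ® 0"))
```

In the loop from the question this would be used as `print(keep_digits(im))`. If the legacy engine data is installed, passing `config='--oem 0 -c tessedit_char_whitelist=0123456789'` may also work, since as far as I know the whitelist is only ignored by the new LSTM engine, but I have not verified that against 4.00.00alpha.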