I am using Windows and want to detect the text in an image, together with its font attributes, using tesserocr. But when I run the Python code, I get this error: RuntimeError: Failed to init API, possibly an invalid tessdata path: C:\Program Files (x86)\Tesseract-OCR\tessdata/
i) I have installed -
tesseract-ocr-w32-setup-v5.0.0-alpha.20190623.exe
(though my system is 64-bit)
ii) Added to path variable (both system and user path) -
C:\Program Files (x86)\Tesseract-OCR
C:\Program Files (x86)\Tesseract-OCR\tessdata
iii) Created a new system environment variable, TESSDATA_PREFIX, pointing to the tessdata folder -
TESSDATA_PREFIX - C:\Program Files (x86)\Tesseract-OCR\tessdata
import pytesseract
import locale
locale.setlocale(locale.LC_ALL, 'C')
from tesserocr import PyTessBaseAPI, RIL, iterate_level, OEM

with PyTessBaseAPI(oem=OEM.TESSERACT_ONLY, lang='bask') as api:
    api.SetImageFile('sugar.png')
    api.Recognize()
    ri = api.GetIterator()
    level = RIL.WORD
    for r in iterate_level(ri, level):
        attrs = r.WordFontAttributes()
        symbol = r.GetUTF8Text(level)
        print(symbol, attrs)
    with PyTessBaseAPI(oem=OEM.TESSERACT_ONLY,lang='bask') as api:
  File "tesserocr.pyx", line 1168, in tesserocr._tesserocr.PyTessBaseAPI.__cinit__
  File "tesserocr.pyx", line 1181, in tesserocr._tesserocr.PyTessBaseAPI._init_api
RuntimeError: Failed to init API, possibly an invalid tessdata path: C:\Program Files (x86)\Tesseract-OCR\tessdata/
You probably don't have the required .traineddata files on your system. You have to copy them from
C:\Program Files\Tesseract-OCR\tessdata
and paste all of the data files into your tessdata directory. I'd also suggest creating a virtual environment and then using tesserocr from there.
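To confirm whether the data files really are missing before touching the OCR code, a quick check along these lines might help. This is only a sketch: missing_traineddata and open_api are hypothetical helper names, and it assumes the path argument of PyTessBaseAPI should point at the tessdata folder itself (depending on the tesserocr and Tesseract versions, it may need a trailing separator or the parent folder instead).

```python
from pathlib import Path


def missing_traineddata(tessdata_dir, langs):
    """Return the language codes that have no <lang>.traineddata file in tessdata_dir."""
    folder = Path(tessdata_dir)
    return [lang for lang in langs
            if not (folder / f"{lang}.traineddata").is_file()]


def open_api(tessdata_dir, lang):
    """Hypothetical helper: fail early with a clear message, then open tesserocr."""
    # Tesseract accepts several languages joined with '+', e.g. "eng+deu".
    missing = missing_traineddata(tessdata_dir, lang.split("+"))
    if missing:
        raise FileNotFoundError(
            f"No .traineddata file for {missing} in {tessdata_dir}")
    # Imported here so the file check above works even without tesserocr installed.
    from tesserocr import PyTessBaseAPI, OEM
    # Assumption: 'path' is the tessdata folder itself; adjust for your version.
    return PyTessBaseAPI(path=str(tessdata_dir), lang=lang,
                         oem=OEM.TESSERACT_ONLY)
```

For the setup in the question, missing_traineddata(r"C:\Program Files (x86)\Tesseract-OCR\tessdata", ["bask"]) would return ["bask"] if bask.traineddata is not in that folder, which would explain the init failure.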