Tags: python, flask, pyinstaller, python-tesseract

Getting a pytesseract error when running an .exe file created with PyInstaller


I am trying to create a simple Flask app that uses pytesseract to run OCR on an image and return the text as a string. I am packaging the whole app into an .exe file with PyInstaller, after obfuscating the Python files with PyArmor.

I have also copied the pytesseract folder next to my source files so that I can add it in the run.spec file when building the .exe, since I need to bundle this dependency with the .exe file. When I run the .exe file, I get the following error:

pytesseract.pytesseract.TesseractError: (1, 'Error opening data file C:\\Users\\Akash\\AppData\\Local\\Temp\\_MEI87082\\Tesseract-OCR\\tessdata\\e13b.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'e13b\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')

To fix this, I added the following line to set the environment variable:

os.environ['TESSDATA_PREFIX']='Tesseract-OCR/tessdata/'

I also tried passing the tessdata directory through the config argument of the image_to_string() function, as follows:

tessdata_dir_config = r'--tessdata-dir "Tesseract-OCR/tessdata/"'
content = pytesseract.image_to_string(image, lang='e13b', config=tessdata_dir_config)
print(content)

But the .exe file still produces the same error.

To resolve path problems, I also use the following function to get the absolute path of bundled files:

import os
import sys


def resource_path(relative_path):
    """Get absolute path to resource, works for dev and for PyInstaller."""
    try:
        # PyInstaller creates a temp folder and stores path in _MEIPASS
        base_path = sys._MEIPASS
    except Exception:
        try:
            base_path = sys._MEIPASS2
        except Exception:
            # Not running from a bundle: fall back to the current directory
            base_path = os.path.abspath(".")
    # print("base_path", base_path)
    # print("relative_path", relative_path)
    return os.path.join(base_path, relative_path)
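As a rough sketch of how such a helper could feed the --tessdata-dir option, the snippet below builds the option string from an absolute path rather than a relative one. The name build_tessdata_config is hypothetical (it is not part of pytesseract), and converting backslashes to forward slashes is only a precaution, on the assumption that backslashes might be treated as escape characters when the option string is split.

```python
import os
import sys


def resource_path(relative_path):
    """Resolve a path against the PyInstaller bundle dir, or the CWD in dev."""
    # sys._MEIPASS only exists when running from a PyInstaller bundle
    base_path = getattr(sys, "_MEIPASS", os.path.abspath("."))
    return os.path.join(base_path, relative_path)


def build_tessdata_config(relative_dir):
    """Hypothetical helper: build a --tessdata-dir option from an absolute path."""
    absolute_dir = resource_path(relative_dir)
    # Assumption: forward slashes avoid backslash-escaping surprises on Windows
    absolute_dir = absolute_dir.replace(os.sep, "/")
    return '--tessdata-dir "{}"'.format(absolute_dir)


config = build_tessdata_config(os.path.join("Tesseract-OCR", "tessdata"))
print(config)
```

The resulting string would then be passed as config=config to pytesseract.image_to_string(image, lang='e13b', config=config). It only helps if the directory really exists at that location inside the extracted bundle.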

I hope this information is enough to answer the question. If you need more, just ask and I will respond.

Thanks in advance.


Solution

  • Later, when I checked the local Temp folder where the .exe file extracts all of its files, I realized that after extraction there was no \tessdata folder at the path we were giving, and e13b.traineddata had been extracted directly inside the "Tesseract-OCR" folder.

    So in app.py I changed the path from

    tessdata_dir_config = r'--tessdata-dir "Tesseract-OCR/tessdata/"'
    

    to

    tessdata_dir_config = r'--tessdata-dir "Tesseract-OCR"'
    

    and this finally resolved the issue.

    But then I got stuck on another error ... well, that's another story.
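The manual check described above (looking inside the extracted folder to see where the .traineddata file actually landed) can also be done in code. This is a minimal sketch, not part of the original fix: find_tessdata_dir is a hypothetical helper that walks a base directory and returns the first folder containing the requested language file.

```python
import os


def find_tessdata_dir(base_path, lang):
    """Return the first directory under base_path that holds <lang>.traineddata.

    Returns None when no such file is found.
    """
    target = lang + ".traineddata"
    for dirpath, _dirnames, filenames in os.walk(base_path):
        if target in filenames:
            return dirpath
    return None
```

In a bundled app, base_path would be sys._MEIPASS (the temporary extraction folder), and the result could be printed at startup or used to build the --tessdata-dir value. If the file ends up directly under "Tesseract-OCR", as it did here, the helper returns that folder instead of the expected "tessdata" subfolder.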