
pytesseract not working in Python even though the path is set


I am trying to install the pytesseract OCR package from the Anaconda Prompt and am running into the following issue. These are the steps I followed:

pip install pillow
pip install pytesseract

Then I downloaded Tesseract-OCR from https://github.com/UB-Mannheim/tesseract/wiki and installed it to the F: drive.

When I tried to run the python code in my Jupyter notebook as below:

from PIL import Image
import pytesseract
pytesseract.pytesseract.tesseract_cmd = 'F:\Tesseract-OCR\tesseract.exe'
im=Image.open(r"G:\Downloads From Chrome\myimg.jpg")
result=pytesseract.image_to_string(im)

It throws the following error:

FileNotFoundError: [WinError 2] The system cannot find the file specified

Please help me resolve the issue. Thank you in advance.

Thank you for solving the above issue. I tried the fix, but it has resulted in another issue:

(1, 'Error opening data file F:\\Tesseract-OCR\\tessdata/english.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory. Failed loading language \'english\' Tesseract couldn\'t load any languages! Could not initialize tesseract.')

I have also set TESSDATA_PREFIX to F:\Tesseract-OCR\tessdata in my system environment variables and restarted, but it still does not work. All the English data files are in the directory mentioned above.


Solution

  • You also need to use a raw string (r"...") for the tesseract_cmd path, just as you did for the image path.

    pytesseract.pytesseract.tesseract_cmd = r'F:\Tesseract-OCR\tesseract.exe'
    

    Otherwise, the \t in \tesseract.exe is interpreted as a tab character, and a path made of F:\Tesseract-OCR, a tab, and esseract.exe surely doesn't exist :)
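To see the difference for yourself, here is a minimal sketch that compares the two spellings of the path. It assumes the same install location as in the question. The variable names `plain` and `raw` are only for illustration, and the first backslash in `plain` is doubled on purpose so that the `\t` is the only escape at play.

```python
from pathlib import Path

# Regular string: "\t" is turned into a single tab character.
plain = "F:\\Tesseract-OCR\tesseract.exe"

# Raw string: every backslash is kept literally.
raw = r"F:\Tesseract-OCR\tesseract.exe"

# repr() makes the hidden tab visible as \t, while real backslashes show up doubled.
print(repr(plain))
print(repr(raw))

# The regular string contains a tab; the raw string does not.
print("\t" in plain, "\t" in raw)

# Optional sanity check before handing the path to pytesseract:
# this is True only if the executable really exists at that location.
print(Path(raw).is_file())
```

If `Path(raw).is_file()` prints False on your machine, the path itself is wrong, regardless of escaping. Writing the path with forward slashes ('F:/Tesseract-OCR/tesseract.exe') should generally avoid the escaping problem as well, since Windows usually accepts them in file paths.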