Tags: python, python-3.x, ocr, tesseract, python-tesseract

Pytesseract: FileNotFound


I have been having some problems with Pytesseract, using this code to test it:

from PIL import Image
import pytesseract

img = Image.open('pic.png')
img.load()
text = pytesseract.image_to_string(img)
print(text)

I am running this on Python 3.4 on Windows.

When I run it, I get the following error, which originates in the Pytesseract module:

Traceback (most recent call last):
  File "C:/Users/Gamer/Documents/Python/Bot/test.py", line 6, in <module>
    text = pytesseract.image_to_string(img)
  File "C:\Python34\lib\site-packages\pytesseract\pytesseract.py", line 122, in image_to_string
    config=config)
  File "C:\Python34\lib\site-packages\pytesseract\pytesseract.py", line 46, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE)
  File "C:\Python34\lib\subprocess.py", line 859, in __init__
    restore_signals, start_new_session)
  File "C:\Python34\lib\subprocess.py", line 1114, in _execute_child
    startupinfo)
FileNotFoundError: [WinError 2] The system cannot find the file specified

I am new to installing modules, so this may come from a bad install or setup of either Tesseract-OCR or the module.

Any help will be greatly appreciated,

-Niall


Solution

  • I didn't have any issues installing Tesseract itself; I used the UB Mannheim installer for Windows:

    https://github.com/UB-Mannheim/tesseract/wiki

    You will also need to install pytesseract:

    pip3.6 install pytesseract

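    The traceback ends in subprocess.Popen inside run_tesseract, so the file Windows cannot find is most likely the tesseract executable rather than the image. Make sure the Tesseract install folder is on your PATH, or set pytesseract.pytesseract.tesseract_cmd to the full path of tesseract.exe. I would also recommend using a variable set with the path to the image to rule out any path-related issues with the image itself. Here is an example: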

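    from PIL import Image
    import pytesseract

    # Path to the image folder (replace USERNAME with your own account name)
    src_path = "C:\\Users\\USERNAME\\Documents\\OCR\\"

    # Run OCR on the image
    text = pytesseract.image_to_string(Image.open(src_path + "pic.png"))

    # Print the OCR result
    print(text)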
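
Since the error in the question is raised by subprocess.Popen, it can also help to check where (or whether) the tesseract executable can be found before running OCR. The following is only a rough sketch, not part of pytesseract: find_executable and configure_pytesseract are hypothetical helper names, and the install path in the usage comment is an assumed default that may differ on your machine. The only pytesseract feature it relies on is the documented pytesseract.pytesseract.tesseract_cmd setting.

```python
import os
import shutil


def find_executable(name="tesseract", candidates=()):
    """Return a path to `name`, or None if it cannot be located.

    Looks on PATH first, then tries each fallback path in `candidates`.
    """
    found = shutil.which(name)
    if found:
        return found
    for path in candidates:
        if os.path.isfile(path):
            return path
    return None


def configure_pytesseract(candidates=()):
    """Point pytesseract at the Tesseract binary; raise if none is found."""
    # Imported here so find_executable can be used without pytesseract installed.
    import pytesseract

    cmd = find_executable("tesseract", candidates)
    if cmd is None:
        raise FileNotFoundError(
            "tesseract executable not found on PATH or in the candidate paths"
        )
    pytesseract.pytesseract.tesseract_cmd = cmd
    return cmd


# Example usage (the path below is an assumed install location, adjust as needed):
# configure_pytesseract([r"C:\Program Files\Tesseract-OCR\tesseract.exe"])
```

If configure_pytesseract raises FileNotFoundError, Tesseract is either not installed or installed somewhere other than the paths you passed in, which would explain the WinError 2 in the original traceback.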