python tesseract python-imaging-library

Python Tesseract "No such file or directory"


I'm trying to write an OCR program in Python. I use Pillow to convert an image to high-contrast black and white, but when I pass the result to Tesseract to extract the text, I get the following error in the terminal:

Error

Traceback (most recent call last):
  File "OCR.py", line 41, in <module>
    print(pytesseract.image_to_string(img))
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 122, in image_to_string
    config=config)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/pytesseract/pytesseract.py", line 46, in run_tesseract
    proc = subprocess.Popen(command, stderr=subprocess.PIPE)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 707, in __init__
    restore_signals, start_new_session)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/subprocess.py", line 1333, in _execute_child
    raise child_exception_type(errno_num, err_msg)
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/bin/tesseract'

Python

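from PIL import Image
import numpy as np
import pytesseract

# Threshold: pixels brighter than this become white, the rest black.
sens = int(input("Sensitivity (0-255): "))

# Convert to RGB so the array always has three channels to average.
im = Image.open("book.jpg").convert("RGB")
pixels = np.asarray(im)

# Average the colour channels to get one brightness value per pixel.
px = pixels.mean(axis=2)

# Vectorized threshold: replaces the per-pixel loop and keeps the
# (height, width) shape, so no flatten/reshape is needed.
pixels = np.where(px > sens, 255, 0).astype(np.uint8)

img = Image.fromarray(pixels)
img.show()
img.save("images2.jpg")

print(pytesseract.image_to_string(img))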

Solution

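  • According to the README, pytesseract is only a wrapper: the Tesseract OCR engine itself must be installed separately. The FileNotFoundError means Python could not find the tesseract executable at /usr/local/bin/tesseract.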

    On Ubuntu:

    sudo apt install tesseract-ocr
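Before calling pytesseract, it can help to confirm that the tesseract executable is actually reachable. The following is a minimal sketch using only the standard library; the helper names find_tesseract and tesseract_version are hypothetical and not part of pytesseract.

```python
import shutil
import subprocess


def find_tesseract(candidates=("tesseract",)):
    """Return the path of the first candidate executable found, or None.

    Hypothetical helper: each candidate may be a bare command name
    (looked up on PATH) or a full path to the binary.
    """
    for name in candidates:
        path = shutil.which(name)
        if path:
            return path
    return None


def tesseract_version(path):
    """Run `<path> --version` and return the first line of its output."""
    result = subprocess.run([path, "--version"], capture_output=True, text=True)
    # Some Tesseract releases print the version to stderr instead of stdout.
    output = result.stdout or result.stderr
    return output.splitlines()[0] if output else ""


if __name__ == "__main__":
    path = find_tesseract()
    if path is None:
        print("tesseract not found on PATH; install it first")
    else:
        print(path, tesseract_version(path))
```

If the binary is installed somewhere other than where pytesseract looks, the returned path can be assigned to pytesseract.pytesseract.tesseract_cmd before calling image_to_string. The paths in the traceback suggest macOS, where Homebrew's "brew install tesseract" is the usual counterpart to the apt command.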