python image opencv text tesseract

What is the Best Way to Get Text From Image with Python?


I want to extract the text from an image. I tried Tesseract but had issues installing it, so I'm wondering if I can get some help with that, or with another way to do it. When I try to use Tesseract, it says there is no module named PIL. But I know I have Pillow installed, and I thought that was what the error referred to.


Solution

  • I've provided a Colab solution since that will probably be useful to the most people.

    Install tesseract in our Colab environment.

    !sudo apt install -y tesseract-ocr
    !pip install pytesseract
    

    Import libraries and mount our Drive

    from google.colab import drive
    from google.colab.patches import cv2_imshow
    drive.mount('/content/drive/')
    

    Set our pytesseract path, read in source image from our Drive, show our source image, then finally convert the image to text.

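    import cv2
    import pytesseract
    
    # Point pytesseract at the binary installed by apt (a plain string path).
    pytesseract.pytesseract.tesseract_cmd = r'/usr/bin/tesseract'
    
    # cv2.imread returns None instead of raising if the file cannot be read.
    src = cv2.imread('/content/drive/MyDrive/Colab Notebooks/images/macbeth.png')
    if src is None:
        raise FileNotFoundError('Could not read the source image')
    cv2_imshow(src)
    
    # OpenCV loads images as BGR; pytesseract assumes RGB.
    rgb = cv2.cvtColor(src, cv2.COLOR_BGR2RGB)
    output_txt = pytesseract.image_to_string(rgb)
    print(type(output_txt))  # <class 'str'>
    print(output_txt)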
    

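For the "no module named PIL" error from the question, a common cause is that Pillow was installed for a different Python interpreter than the one running the script. The following is a minimal diagnostic sketch using only the standard library; `check_ocr_setup` is a hypothetical helper name chosen for illustration, not part of pytesseract or Pillow.

```python
import importlib.util
import shutil
import sys


def check_ocr_setup():
    """Report whether the pieces pytesseract relies on are visible to this interpreter.

    Hypothetical helper for illustration; not part of any library.
    """
    return {
        # Pillow is installed under the name "pillow" but imported as "PIL".
        "PIL": importlib.util.find_spec("PIL") is not None,
        "pytesseract": importlib.util.find_spec("pytesseract") is not None,
        # The Tesseract engine is a separate executable that must be on PATH.
        "tesseract_binary": shutil.which("tesseract") is not None,
    }


if __name__ == "__main__":
    # The interpreter path shows which Python installation is being checked.
    print("Interpreter:", sys.executable)
    for name, found in check_ocr_setup().items():
        print(f"{name}: {'found' if found else 'missing'}")
```

If PIL is reported as missing, installing with the same interpreter that runs the script (for example `python -m pip install pillow`) usually resolves the mismatch. If only the binary is missing, the Tesseract engine itself still needs to be installed separately from the Python packages.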