Tags: python, ocr, paddle-paddle, paddleocr

Not able to import paddleocr library on Google Colab


I cannot import the paddleocr library on Google Colab, even though paddlepaddle and paddleocr both installed successfully. The import fails with the error shown below:

    from paddleocr import PaddleOCR, draw_ocr



Error: Can not import paddle core while this file exists: /usr/local/lib/python3.10/dist-packages/paddle/fluid/libpaddle.so
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
<ipython-input-7-e16532d3d475> in <cell line: 1>()
----> 1 from paddleocr import PaddleOCR, draw_ocr
      2 import os
      3 import cv2
      4 import matplotlib.pyplot as plt
      5 get_ipython().run_line_magic('matplotlib', 'inline')

8 frames
/usr/local/lib/python3.10/dist-packages/paddle/fluid/core.py in <module>
    267 
    268 try:
--> 269     from . import libpaddle
    270 
    271     if avx_supported() and not libpaddle.is_compiled_with_avx():

ImportError: libssl.so.1.1: cannot open shared object file: No such file or directory

This worked previously, but as of today I can no longer import PaddleOCR. I would appreciate the community's help with this issue.


Solution

  • Using the following steps, I was able to get PaddleOCR to run in Google Colab:

    1. Go to the "Runtime" tab, select "Change runtime type", and under "Hardware accelerator" select "GPU".

    2. Upload the image to be analyzed to the /content directory of Google Colab.

    3. Install the PaddleOCR packages using pip:

    !pip install paddlepaddle-gpu
    !pip install paddleocr
    
    4. Download and install libssl1.1, the library that the ImportError reports as missing:

    !wget http://archive.ubuntu.com/ubuntu/pool/main/o/openssl/libssl1.1_1.1.0g-2ubuntu4_amd64.deb
    !sudo dpkg -i libssl1.1_1.1.0g-2ubuntu4_amd64.deb
    
    5. Clone the PaddleOCR repo from GitHub to use the fonts:

    !git clone https://github.com/PaddlePaddle/PaddleOCR

    6. Run a code block to test that PaddleOCR is working. The code block is from the project description section on PaddleOCR's PyPI page:

    from paddleocr import PaddleOCR, draw_ocr
    from PIL import Image
    from IPython import display

    img_path = '/content/test_img.PNG'  # the image uploaded in step 2
    ocr = PaddleOCR(lang='en')

    # rec=False runs text detection only, so each entry is a bounding box
    result = ocr.ocr(img_path, rec=False)
    for res in result:
        for line in res:
            print(line)

    # Draw the detected boxes for the first (only) image and save the result
    result = result[0]
    image = Image.open(img_path).convert('RGB')
    im_show = draw_ocr(image, result, txts=None, scores=None, font_path='/content/PaddleOCR/StyleText/fonts/ko_standard.ttf')
    im_show = Image.fromarray(im_show)
    im_show.save('result.jpg')
    display.Image('result.jpg')
    

    The output is shown below:

    [image: OCR_image]
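
If the import still fails after these steps, it can help to check whether the dynamic loader can now find libssl.so.1.1, the file named in the ImportError. The following is a minimal sketch using Python's standard ctypes module. The helper name `shared_library_loads` is hypothetical and not part of PaddleOCR. It only tests that the library can be opened and does not guarantee that paddle itself will import.

```python
import ctypes


def shared_library_loads(name: str) -> bool:
    """Return True if the dynamic loader can open the shared library `name`."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # ctypes raises OSError when the library cannot be found or loaded
        return False
    return True


# libssl.so.1.1 is the file reported as missing in the traceback
if shared_library_loads("libssl.so.1.1"):
    print("libssl.so.1.1 can be loaded")
else:
    print("libssl.so.1.1 is still missing; repeat the libssl1.1 install step")
```

Running this in a Colab cell before `from paddleocr import PaddleOCR, draw_ocr` separates a missing system library from other import problems.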