python google-colaboratory ocr python-tesseract

KeyError: 'PNG' while using pytesseract.image_to_data

I tried to draw boxes around the text in an image file using the pytesseract function image_to_data, but I get the following error on Colab:

    KeyError                                  Traceback (most recent call last)
<ipython-input-10-a92a28892aac> in <module>()
      6 img = cv2.imread("a.jpg")
      7 
----> 8 d = pytesseract.image_to_data(img, output_type=Output.DICT)
      9 print(d.keys())

5 frames
/usr/local/lib/python3.7/dist-packages/PIL/Image.py in save(self, fp, format, **params)
   2121         expand=0,
   2122         center=None,
-> 2123         translate=None,
   2124         fillcolor=None,
   2125     ):

KeyError: 'PNG'

The code I am using is:

import cv2
import pytesseract
from pytesseract import Output
from PIL import Image

img = cv2.imread("a.jpg")

d = pytesseract.image_to_data(img, output_type=Output.DICT)
print(d.keys())

Thinking that image_to_data might only work with PNG and not JPEG (which would be odd), I added a few lines to convert the JPEG to PNG before passing it to image_to_data, but the same error persists. Why is this happening?

Solution

In Colab, install Tesseract with the following command:

!sudo apt-get install tesseract-ocr

Next, install the pytesseract library and restart the runtime.

!pip install pytesseract==0.3.9

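Make sure you restart the runtime before running the code. The restart matters: the install can replace Pillow on disk while the old version is still loaded in memory, which is a likely cause of the KeyError: 'PNG' (the traceback showing unrelated source lines inside Image.save points the same way). After restarting, upload the image to the workspace or use its path on Drive in place of the path shown here.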

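import cv2
import pytesseract
from pytesseract import Output

# Example path; replace it with your uploaded image or a file on Drive.
imag = cv2.imread('/content/1.jpg')
if imag is None:
    # cv2.imread returns None instead of raising when the file cannot be read.
    raise FileNotFoundError('Could not read /content/1.jpg')

# image_to_data returns per-word boxes, confidences and text as a dict.
d = pytesseract.image_to_data(imag, output_type=Output.DICT)
print(d.keys())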
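
If the KeyError persists, it can help to confirm that the Tesseract binary is on the PATH and that the Pillow version loaded in memory matches the one installed on disk. A mismatch would suggest the runtime was not restarted after the install. The following is only a minimal sketch: check_ocr_setup is a hypothetical helper name, not part of pytesseract or Pillow, and it assumes Python 3.8+ for importlib.metadata.

```python
import shutil
import sys
from importlib import metadata  # assumption: Python 3.8 or newer


def check_ocr_setup():
    """Hypothetical helper: report on the pieces pytesseract depends on."""
    report = {}

    # Path of the tesseract executable, or None if it is not on the PATH.
    report["tesseract_binary"] = shutil.which("tesseract")

    # Pillow version recorded in the installed package metadata (on disk).
    try:
        report["pillow_installed"] = metadata.version("Pillow")
    except metadata.PackageNotFoundError:
        report["pillow_installed"] = None

    # Pillow version of the module already imported in this session, if any.
    pil = sys.modules.get("PIL")
    report["pillow_loaded"] = getattr(pil, "__version__", None) if pil else None

    # Different versions in memory and on disk suggest a restart is needed.
    report["restart_needed"] = (
        report["pillow_loaded"] is not None
        and report["pillow_installed"] is not None
        and report["pillow_loaded"] != report["pillow_installed"]
    )
    return report


print(check_ocr_setup())
```

Run it in a cell after importing pytesseract; if restart_needed is True, restart the runtime and run the OCR code again.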

A Colab notebook is linked for reference.

The output looks like this:

dict_keys(['level', 'page_num', 'block_num', 'par_num', 'line_num', 'word_num', 'left', 'top', 'width', 'height', 'conf', 'text'])
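
Since the original goal was to draw boxes around the text, the dictionary above can be turned into rectangles roughly as follows. This is a sketch under assumptions, not the only way to do it: word_boxes and draw_boxes are hypothetical helper names, the confidence threshold of 60 is an arbitrary choice, and the sample dictionary is made-up data shaped like the image_to_data output so the filtering can be tried without running OCR.

```python
def word_boxes(data, min_conf=60.0):
    """Return (left, top, width, height, text) for each confident, non-empty word.

    `data` is the dict from image_to_data(..., output_type=Output.DICT).
    """
    boxes = []
    for i, text in enumerate(data["text"]):
        # conf may arrive as a string, int or float; -1 marks non-word rows.
        conf = float(data["conf"][i])
        if text.strip() and conf >= min_conf:
            boxes.append((data["left"][i], data["top"][i],
                          data["width"][i], data["height"][i], text))
    return boxes


def draw_boxes(img, boxes, color=(0, 255, 0), thickness=2):
    """Draw each box on an OpenCV image in place and return the image."""
    import cv2  # imported here so word_boxes works without OpenCV installed
    for x, y, w, h, _ in boxes:
        cv2.rectangle(img, (x, y), (x + w, y + h), color, thickness)
    return img


# Made-up sample shaped like the image_to_data output (not real OCR results).
sample = {
    "text": ["", "Hello", "world", " "],
    "conf": ["-1", 96.5, 42, "-1"],
    "left": [0, 10, 80, 0],
    "top": [0, 12, 12, 0],
    "width": [200, 60, 70, 0],
    "height": [50, 20, 20, 0],
}
print(word_boxes(sample))  # [(10, 12, 60, 20, 'Hello')]
```

With the real result, draw_boxes(imag, word_boxes(d)) marks the detected words on the image, which can then be saved with cv2.imwrite or displayed in the notebook.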