I am trying to classify images based on their content. For example, I have many images like the ones below, each containing some content, in this case a numeric value. I tried the OpenCV and Pytesseract OCR solution proposed here: https://stackoverflow.com/a/60161328/7250310
However, this solution doesn't work on my images, and the content isn't detected. Below are my sample images:
Do you have any other ideas to achieve this? Basically, Image 1 should give the output 1, and so on.
This simple approach works at least for the four presented images:
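import cv2
import pytesseract

images = ['4sXGS.jpg', 'Nizki.jpg', 'T0EM8.jpg', 'g2fY7.jpg']

for filename in images:
    # Load as grayscale, then binarize with an inverted Otsu threshold
    img = cv2.imread(filename, cv2.IMREAD_GRAYSCALE)
    img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)[1]
    # --psm 10 treats the image as a single character
    text = pytesseract.image_to_string(img, config='--psm 10')
    # Strip the trailing newline and form feed that Tesseract appends
    text = text.replace('\n', '').replace('\f', '')
    print(text)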
The single steps are: read the image as grayscale, apply inverse binary thresholding with Otsu's method, and run Tesseract with the --psm 10 option (single character).
Maybe also add the described whitelisting for identifying digits only.
Caveat: I use a special version of Tesseract from the Mannheim University Library.
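The whitelisting mentioned above could look roughly like the following sketch. It relies on Tesseract's `tessedit_char_whitelist` config variable; how well that variable is honoured has varied between Tesseract versions and OCR engines, so please verify it on your installation. The helper names `digit_config`, `clean_ocr_text` and `read_digit` are made up for this illustration and are not part of pytesseract.

```python
def digit_config(psm: int = 10) -> str:
    """Build a Tesseract config string that restricts output to digits."""
    return f"--psm {psm} -c tessedit_char_whitelist=0123456789"


def clean_ocr_text(text: str) -> str:
    """Remove all whitespace, including the trailing newline and form feed."""
    return "".join(text.split())


def read_digit(path: str) -> str:
    """Hypothetical helper: OCR one image with the preprocessing shown above."""
    # Imported here so the two helpers above work without OpenCV/Tesseract.
    import cv2
    import pytesseract

    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    if img is None:
        # cv2.imread returns None instead of raising when the file is unreadable
        raise FileNotFoundError(path)
    # Inverted Otsu threshold, as in the snippet above
    img = cv2.threshold(img, 0, 255, cv2.THRESH_OTSU + cv2.THRESH_BINARY_INV)[1]
    raw = pytesseract.image_to_string(img, config=digit_config())
    return clean_ocr_text(raw)
```

Calling `read_digit('4sXGS.jpg')` would then run the same pipeline with the digit restriction applied. I have not checked whether this changes the results on the four sample images.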
System information
Platform: Windows-10-10.0.19041-SP0
Python: 3.9.1
PyCharm: 2021.1.1
OpenCV: 4.5.2
pytesseract: 5.0.0-alpha.20201127