I have an image that looks like this:
And this is the processed image
I have tried pretty much everything. I processed the image like this:
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #Converting to GrayScale
(h, w) = gray.shape[:2]
gray = cv2.resize(gray, (w*2, h*2))
thresh = cv2.threshold(gray, 150, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
gray = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, rectKernel)
blur = cv2.GaussianBlur(gray,(1,1),cv2.BORDER_DEFAULT)
text = pytesseract.image_to_string(blur, config="--oem 1 --psm 6")
But Tesseract doesn't print anything. I am using Tesseract 5.0.0-alpha.20201127.
How do I improve its performance? It's highly unreliable.
Edit:
The answer below did a wonderful job on the said image.
But when I apply this technique to an image like this one, I get the wrong output.
Why is that? They seem roughly the same.
The problem is that the characters are not in the center of the image.
Sometimes Tesseract has difficulty recognizing characters or digits if they are not centered.
Therefore my suggestion is:
Centering the characters:
cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
50 is just a padding value; you can set it to any other value.
The background turns blue because of the value argument. OpenCV reads images in BGR order, so passing [255] is the same as passing [255, 0, 0]: the blue channel is set to 255, while green and red are left at 0.
You can try other values. For me it doesn't matter, since I convert the image to gray-scale in the next step.
Now when you run OCR on the image, the result is:
MEHVISH MUQADDAS
Code:
import cv2
import pytesseract
# Load the image
img = cv2.imread("onf0D.jpg")
# Center the image
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to gray-scale
gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# OCR
txt = pytesseract.image_to_string(gry, config="--psm 6")
print(txt)
Read more: tesseract-improve-quality.
You don't need to apply threshold, GaussianBlur or morphologyEx.
The reasons are:
Simple thresholding is used to bring out the features of an image. The input image's features are already available.
You don't have to smooth the image, since there is no illumination effect on it.
You don't need to do segmentation, since the background is plain white.
Update-1
The second image requires pre-processing. However, applying simple-threshold won't work on this image. You need to remove the background using a binary mask, then you can apply OCR.
Now, if you apply OCR:
IRUM FEROZ
Code:
import cv2
import numpy as np
import pytesseract
# Load the image
img = cv2.imread("jCMft.jpg")
# Center the image
img = cv2.copyMakeBorder(img, 50, 50, 50, 50, cv2.BORDER_CONSTANT, value=[255])
# Up-sample
img = cv2.resize(img, (0, 0), fx=2, fy=2)
# Convert to HSV color-space
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# Binary mask from an HSV range (keeps dark pixels, V <= 130)
msk = cv2.inRange(hsv, np.array([0, 0, 0]), np.array([179, 255, 130]))
# OCR
txt = pytesseract.image_to_string(msk, config="--psm 6")
print(txt)
Q: How do I find the lower and upper bounds for the cv2.inRange method?
A: You can use the following script.
Q: What did you change in the second image?
A: First I converted the image to HSV instead of gray-scale, because I wanted to remove the background. If you experiment with adaptiveThreshold, you will see that it leaves a lot of artifacts on the background, which limits Tesseract's recognition. Then I used cv2.inRange to get a binary mask. Feeding the binary mask to Tesseract gave me the desired result.