I was given a school project for recognizing various kinds of CAPTCHA, and I had some difficulties with its implementation.
Images of this type will be fed into input ,,.
I handle them with the following code:
import cv2
import pytesseract
# load image
fname = 'picture.png'
im = cv2.imread(fname,cv2.COLOR_RGB2GRAY)
pytesseract.pytesseract.tesseract_cmd = r'C:\Tesseract-OCR\tesseract.exe'
im = im[0:90, 35:150]
im = cv2.blur(im,(3,3))
im = cv2.threshold(im, 223 , 250, cv2.THRESH_BINARY)
im = im[1]
cv2.imshow('',im)
cv2.waitKey(0)
After all processing, the image looks like this: And at this point, I have a problem, how can I modify the image to good readability by the computer, so that instead of the wrong TAREQ.
he would display the 7TXB6Q
I am trying to display text from an image with the pytesseract
library as follows
data = pytesseract.image_to_string(im, lang='eng', config='--psm 6 --oem 3 -c tessedit_char_whitelist= ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
print(data)
I am writing here hoping to get valuable advice (perhaps you know the most suitable way to get text from a picture or process the image pinned above). Peace for everyone)
More images
You can try finding countours and eliminating those which have small areas. This preprocessing operation should increase the success of OCR result.
import cv2 as cv
import numpy as np
# your thresholded image im
bw = cv.imread('bw.png', cv.IMREAD_GRAYSCALE)
_, cnts, _ = cv.findContours(bw, cv.RETR_TREE, cv.CHAIN_APPROX_SIMPLE)
# remove the largest contour which is background
cnts = np.array(cnts[1:], dtype=object)
areas = np.array(list(map(cv.contourArea, cnts)))
thr = 35
thr_cnts = cnts[areas > thr]
disp_img = 255 * np.ones(bw.shape, dtype=np.uint8)
disp_img = cv.drawContours(disp_img, thr_cnts, -1, (0, 0, 0), cv.FILLED)
disp_img = cv.bitwise_or(disp_img, bw)
cv.imshow('result', disp_img)
cv.waitKey()
cv.destroyAllWindows()
cv.imwrite('result.png', disp_img)
Edit: It seems that merging the two codes did not give the same result. This is the full code from the beginning to the end.
import cv2 as cv
import numpy as np
# load image
fname = 'im.png'
im = cv.imread(fname, cv.IMREAD_GRAYSCALE)
# crop
im = im[0:90, 35:150]
# blurring is essential for denoising
im = cv.blur(im, (3,3))
thr = 219
# the binary threshold value is very important
# using 220 instead of 219 causes loss of a letter
# because it touches to the bottom edge and gets involved in the background
_, im = cv.threshold(im, thr, 255, cv.THRESH_BINARY)
cv.imshow('', im)
cv.waitKey(0)
# binary image
bw = np.copy(im)
# find contours and corresponding areas
_, cnts, _ = cv.findContours(bw, cv.RETR_LIST, cv.CHAIN_APPROX_NONE)
cnts = np.array(cnts, dtype=object)
areas = np.array(list(map(cv.contourArea, cnts)))
thr = 35
# eliminate contours that are smaller than threshold
# also remove the largest contour which is background
thr_cnts = cnts[np.logical_and(areas > thr, areas != np.max(areas))]
# draw the remaining contours
disp_img = 255 * np.ones(bw.shape, dtype=np.uint8)
disp_img = cv.drawContours(disp_img, thr_cnts, -1, (0, 0, 0), cv.FILLED)
disp_img = cv.bitwise_or(disp_img, bw)
cv.imshow('', disp_img)
cv.waitKey()
cv.destroyAllWindows()