I tried to extract the number from the attached image
But I am not getting the number 8 as output. I also tried different PSM values, such as 6 and 10.
This is what I have so far:
import cv2
import pytesseract

image = cv2.imread(image_path)
if image is not None:
    # Convert the image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Use OCR to extract text from the image
    extracted_text = pytesseract.image_to_string(gray, config='--psm 10 -c tessedit_char_whitelist=0123456789')
Even though the image looks good for OCR, the shading from the vertical line hurts the detection. I applied some thresholds and eventually got this image:
Feeding this to Tesseract gives me the "8":
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd = "C:/Program Files/Tesseract-OCR/tesseract.exe"
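im = cv2.imread("8.png")  # read the image (OpenCV loads it in BGR order)
b, g, r = cv2.split(im)  # split into blue, green and red channels
mask = (b > 200) & (r < 200) & (g < 200)  # threshold: high blue, lower red and green
text = pytesseract.image_to_string(mask, config='-l eng --psm 10')  # OCR, treating the image as a single character
print(text)  # prints "8" for this image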
Of course, this will fail if other colours are involved. If you have this situation, can you post more images so I can adjust the code?
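In the meantime, here is a rough sketch of how the threshold could be made adjustable for other colours. The helper name `bgr_range_mask` and its `lower`/`upper` parameters are my own, not part of OpenCV or pytesseract, and the bounds in the usage note are simply the thresholds from the snippet above. It does the same kind of per-channel range check that `cv2.inRange` performs, written with NumPy only:

```python
import numpy as np


def bgr_range_mask(image, lower, upper):
    """Return a uint8 mask: 255 where every BGR channel lies within
    [lower, upper] (inclusive), 0 elsewhere.

    `bgr_range_mask`, `lower` and `upper` are illustrative names,
    not part of OpenCV or pytesseract.
    """
    lower = np.asarray(lower, dtype=image.dtype)
    upper = np.asarray(upper, dtype=image.dtype)
    # A pixel is kept only if all three channels fall inside the bounds
    inside = np.all((image >= lower) & (image <= upper), axis=-1)
    return inside.astype(np.uint8) * 255
```

For the image in the question, `bgr_range_mask(im, (201, 0, 0), (255, 199, 199))` should select the same pixels as `(b > 200) & (r < 200) & (g < 200)`, and the result can be passed to `pytesseract.image_to_string` in the same way. For other colours the bounds would have to be chosen per image, so treat those numbers as a starting point only.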