I have these images
I want to remove the noise from these images so that I can convert them to text with pytesseract. The noise is only blue, so I tried removing the blue from the image, but the results are still not good.
This is what I did:
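import cv2
import pytesseract
# Load the image (placeholder path; the original post does not show how img was loaded)
img = cv2.imread("input.png")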
# Extract the blue channel
blue = img[:, :, 0]
# Apply thresholding to the blue channel
thresh = cv2.threshold(blue, 0, 255, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
# Perform morphological operations to remove noise
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (1,1))
opening = cv2.morphologyEx(thresh, cv2.MORPH_OPEN, kernel, iterations=7)
# Blurring is currently disabled, so the opened image is used as-is
blur = opening  # cv2.medianBlur(opening, 1)
cv2.imwrite("/Users/arjunmalik/Desktop/blur.png", blur)
display("/Users/arjunmalik/Desktop/blur.png")
The result was
The OCR result was FL1S4y.
As Sembei stated, you need a morphological closing operator here. Closing fills the small black points inside the characters, which is exactly what this kind of noise calls for and noticeably improves the image quality.
Solution:
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (4,4))
closing = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel, iterations=1)
Replacing the opening step in your code with these two lines gives the following output for the second image.
Output:
You might need to change the size of the kernel for different input images.
Thoughts:
I think you would get better results by segmenting the characters first and then applying the closing operator to each character separately, so that the closing cannot merge neighbouring characters.