I'm processing images with OpenCV and Python, and I need to remove the dots/noise from an image. Both the background and the text contain empty dots/lines. Here is an example: Example Image. With this code I am able to remove the background:
import cv2
import numpy as np
img = cv2.imread('image.png', 0)
_, blackAndWhite = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY_INV)
nlabels, labels, stats, centroids = cv2.connectedComponentsWithStats(blackAndWhite, None, None, None, 8, cv2.CV_32S)
sizes = stats[1:, -1]  # get CC_STAT_AREA component
img2 = np.zeros(labels.shape, np.uint8)
for i in range(0, nlabels - 1):
    if sizes[i] >= 50:  # filter small dotted regions
        img2[labels == i + 1] = 255
res = cv2.bitwise_not(img2)
cv2.imwrite('res.png', res)
The result is: Result
How can I fill the empty spaces in the text? Or how can I make this image readable for Tesseract OCR?
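I'm not sure if this will work, but you can try a dilate+erode (morphological closing, which fills small gaps) and/or an erode+dilate (morphological opening, which removes small specks) transform.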
This can be implemented with OpenCV: https://docs.opencv.org/4.5.5/d9/d61/tutorial_py_morphological_ops.html