python opencv ocr easyocr image-enhancement

How to preprocess a low-contrast image to improve OCR quality and avoid information loss?

I am trying to do OCR text detection with an image that has low contrast.

Raw:

Raw

I am currently applying these filters as preprocessing:

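import cv2
from PIL import Image

# img is a BGR image already loaded with cv2.imread; equalize its luma (Y) channel
img_yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
img_yuv[:,:,0] = cv2.equalizeHist(img_yuv[:,:,0])
img_output = cv2.cvtColor(img_yuv, cv2.COLOR_YUV2BGR)
# Blur the equalized image (the 5x5 kernel size is an assumption), then go to grayscale
img_blur = cv2.GaussianBlur(img_output, (5, 5), 0)
img_gray = cv2.cvtColor(img_blur, cv2.COLOR_BGR2GRAY)
# Fixed binary threshold: pixels above 210 become 230, everything else 0
thresh, im_bw = cv2.threshold(img_gray, 210, 230, cv2.THRESH_BINARY)
data = Image.fromarray(im_bw)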

After pre-processing, this is what I get

Results

How can I improve my approach?

Solution

Your input image has heavy salt-and-pepper noise, so you should use median blurring instead of Gaussian blurring. The lighting also varies across the image, so you should use adaptive thresholding to minimize the effect of the uneven light. Furthermore, because the image is so noisy, histogram equalization amplifies the noise along with the contrast.

I divided your image into three pieces along its width, thresholded each piece separately with Otsu's method, and then joined the pieces back together. Let me show you the effect of Gaussian blur, median blur, and histogram equalization on the thresholded images.

This is the original image:

This is the Gaussian-blurred and thresholded image:

This is the median-blurred and thresholded image:

As you can see, median blur is better suited to your image because it has salt-and-pepper noise throughout.

Now let's look at the effect of equalization on a heavily noisy image. This is the YUV-equalized image:

This is the Gaussian Blurred version:

This is the Median Blurred version:

So you can achieve the best result using the original image without the equalization and with median blurring. You can also use morphological operations to further improve the image.

This is the code I used:

import cv2


def thresholder(img, fil):
    """Blur the image, then Otsu-threshold three pieces of it separately."""
    h, w = img.shape[:2]
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    if fil == 'median':
        blurred = cv2.medianBlur(gray, 11)
    elif fil == 'gauss':
        blurred = cv2.GaussianBlur(gray, (11, 11), 0)
    else:
        raise ValueError(f"unknown filter: {fil!r}")

    # Divide the image into 3 pieces along its width and threshold them separately
    p1, p2 = w // 3, 2 * w // 3
    otsu = cv2.THRESH_BINARY + cv2.THRESH_OTSU
    _, thresh1 = cv2.threshold(blurred[0:h, 0:p1], 0, 255, otsu)
    _, thresh2 = cv2.threshold(blurred[0:h, p1:p2], 0, 255, otsu)
    _, thresh3 = cv2.threshold(blurred[0:h, p2:w], 0, 255, otsu)

    # Merge the separately thresholded pieces
    thresh = cv2.hconcat([thresh1, thresh2, thresh3])

    return thresh


img = cv2.imread('img/plate.png')
if img is None:
    raise FileNotFoundError('could not read img/plate.png')

img_yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
img_yuv[:,:,0] = cv2.equalizeHist(img_yuv[:,:,0])
img_equalized = cv2.cvtColor(img_yuv, cv2.COLOR_YUV2BGR)


cv2.imshow('Original Image',img)
cv2.imshow('Equalized Image',img_equalized)

cv2.imshow('Original Median Blurred',thresholder(img,fil='median'))
cv2.imshow('Original Gaussian Blurred',thresholder(img,fil='gauss'))

cv2.imshow('Equalized Median Blurred',thresholder(img_equalized,fil='median'))
cv2.imshow('Equalized Gaussian Blurred',thresholder(img_equalized,fil='gauss'))

# Wait for a key press, then close all windows
cv2.waitKey(0)
cv2.destroyAllWindows()