python opencv image-processing ocr image-thresholding

Proper image thresholding to prepare it for OCR in python using opencv


I am really new to OpenCV and a beginner in Python.

I have this image:

original bmp 24bit image

I want to somehow apply proper thresholding to keep nothing but the 6 digits.

The bigger picture is that I intend to perform manual OCR on each digit separately, using the k-nearest neighbours algorithm at a per-digit level (kNearest.findNearest).

The problem is that I cannot clean up the digits sufficiently, especially the '7' digit which has this blue-ish watermark passing through it.

The steps I have tried so far are the following:

I am reading the image from disk:

import sys
import cv2
# IMREAD_UNCHANGED is -1
image = cv2.imread(sys.argv[1], cv2.IMREAD_UNCHANGED)

Then I'm keeping only the blue channel to get rid of the blue watermark around digit '7', effectively converting it to a single-channel image:

image = image[:,:,0]
# opened with -1, which means "as is",
# so blue is the first channel (BGR order)
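
To see why the blue channel helps here, consider a minimal sketch with made-up pixel values (the three colours below are assumptions chosen for illustration, not measured from the image). A bluish mark has a high blue value, so in the blue channel it sits close to the white paper, while a dark digit stays dark in every channel.

```python
import numpy as np

# Hypothetical BGR pixels: white paper, a bluish watermark, and a dark digit stroke.
paper = (255, 255, 255)
watermark = (230, 150, 120)   # strong blue, weaker green and red
digit = (40, 40, 40)

bgr = np.array([[paper, watermark, digit]], dtype=np.uint8)  # shape (1, 3, 3)

blue = bgr[:, :, 0]   # channel 0 is blue in OpenCV's BGR order
red = bgr[:, :, 2]    # channel 2 is red

print(blue[0].tolist())  # watermark stays close to the paper value
print(red[0].tolist())   # watermark lands between paper and digit
```

Note that with IMREAD_UNCHANGED a grayscale file is returned as a 2-D array, in which case the `[:,:,0]` index would fail; the snippet above assumes a colour image.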

single channel - blue only - image

Then I'm multiplying it by 1.5 to increase contrast between the digits and the background:

image = cv2.multiply(image, 1.5)

multiplied image to increase contrast

Finally I perform Binary+Otsu thresholding:

_,thressed1 = cv2.threshold(image,0,255,cv2.THRESH_BINARY+cv2.THRESH_OTSU)

binary Otsu thresholded image

As you can see, the end result is pretty good except for the digit '7', which has kept a lot of noise.

How can I improve the end result? Please include an example result image where possible; it is easier to understand than code snippets alone.


Solution

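  • You can try applying medianBlur to the gray (blue-channel) image with two different kernel sizes (such as 3 and 51), dividing the blurred results, and thresholding the ratio. The small kernel removes speckle noise, while the large one estimates the local background, so the ratio evens out the background. Something like this: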


    #!/usr/bin/python3
    # 2018/09/23 17:29 (CST) 
    # (中秋节快乐)
    # (Happy Mid-Autumn Festival)
    
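    import cv2
    import numpy as np

    fname = "color.png"
    # imread returns BGR, so index 0 is the blue channel
    bgray = cv2.imread(fname)[...,0]

    # small kernel removes speckle; large kernel estimates the background
    blured1 = cv2.medianBlur(bgray,3)
    blured2 = cv2.medianBlur(bgray,51)
    # floating-point ratio of the two; np.ma.divide masks division by zero
    divided = np.ma.divide(blured1, blured2).data
    # rescale the ratio to the 0..255 range
    normed = np.uint8(255*divided/divided.max())
    # with THRESH_OTSU the threshold argument (100) is ignored
    th, threshed = cv2.threshold(normed, 100, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU)

    # stack all intermediate images vertically for inspection
    dst = np.vstack((bgray, blured1, blured2, normed, threshed))
    cv2.imwrite("dst.png", dst)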
    

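    The result (dst.png) stacks, from top to bottom, the blue channel, the two blurred images, the normalized ratio, and the thresholded output.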