Tags: python, opencv, ocr, image-thresholding

Is there a better way to separate the writing from the background?


I am working on a project where I need to run OCR on some documents.
The first step is to threshold the image so that only the writing remains (i.e., whiten the background).

Example of an input image (for GDPR and privacy reasons, this image is from the Internet):

[input image]

Here is my code:

import cv2
import numpy as np


image = cv2.imread('b.jpg')
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
h = image.shape[0]
w = image.shape[1]
for y in range(0, h):
    for x in range(0, w):
        if image[y, x] >= 120:
            image[y, x] = 255
        else:
            image[y, x] = 0
cv2.imwrite('output.jpg', image)

Here is the result that I got:

[output image]

When I ran pytesseract on the output image, the results were not satisfactory (I know that OCR is not perfect). I tried adjusting the threshold value (120 in this code), but the result was still not as clear as I wanted.

Is there a better way to threshold the image so that only the writing stays black and everything else becomes white?


Solution

  • You can use adaptive thresholding. From the documentation:

    In this, the algorithm calculate the threshold for a small regions of the image. So we get different thresholds for different regions of the same image and it gives us better results for images with varying illumination.

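    import cv2

    # Load the image and convert it to grayscale
    image = cv2.imread('b.jpg')
    image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Median blur suppresses speckle noise before thresholding
    image = cv2.medianBlur(image, 5)

    # Each pixel is compared against a threshold derived from its 11x11
    # neighbourhood (mean or Gaussian-weighted mean) minus the constant 2
    th1 = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_MEAN_C,
                                cv2.THRESH_BINARY, 11, 2)
    th2 = cv2.adaptiveThreshold(image, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                cv2.THRESH_BINARY, 11, 2)
    cv2.imwrite('output1.jpg', th1)
    cv2.imwrite('output2.jpg', th2)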