python opencv image-processing tesseract

How to remove a curly line from an image


I use Tesseract OCR to recognize the text in an image. I have a problem with images that contain curly lines. I tried various methods, such as thresholding, a Gaussian filter, and extraction by color, but I cannot remove the lines. I want to delete the lines without losing the numbers.

curly line on image

This is the image after applying the threshold method:

image after applying the threshold method

I’m using OpenCV for image processing and Tesseract 4.0 to recognize the text.

Any hint or some sense of direction will be much appreciated. Thanks in advance for your help.


Solution

  • I tried multiple approaches. The following one is the closest I could get.


    Simple Algorithm:

    1. Extract the green channel of the image
    2. Apply a Gaussian blur with a 3x3 kernel
    3. Apply histogram equalization
    4. Find a suitable threshold value that gives the desired result
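
    The four steps above can be sketched in OpenCV roughly as follows. This is a minimal sketch, not the answerer's original code: the function name `preprocess_for_ocr` and the default `thresh_value=100` are assumptions for illustration, and the threshold has to be tuned per image.

    ```python
    import cv2
    import numpy as np


    def preprocess_for_ocr(bgr_image: np.ndarray, thresh_value: int = 100) -> np.ndarray:
        """Green channel -> 3x3 Gaussian blur -> histogram equalization -> fixed threshold."""
        # OpenCV loads colour images in B, G, R order, so index 1 is the green channel.
        green = np.ascontiguousarray(bgr_image[:, :, 1])
        # A sigma of 0 lets OpenCV derive the sigma from the 3x3 kernel size.
        blurred = cv2.GaussianBlur(green, (3, 3), 0)
        # Stretch the contrast so that one global threshold separates ink from background.
        equalized = cv2.equalizeHist(blurred)
        # thresh_value is a placeholder (assumption); adjust it until the line disappears.
        _, binary = cv2.threshold(equalized, thresh_value, 255, cv2.THRESH_BINARY)
        return binary
    ```

    A typical call would be `binary = preprocess_for_ocr(cv2.imread("input.png"))`, where `input.png` is a hypothetical file name; the result can then be passed to Tesseract.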

    This is just a starting point. You can get far better results if you incorporate adaptive thresholding and morphological operations.
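
    One possible way to combine adaptive thresholding with a morphological opening is sketched below. It rests on an assumption that may not hold for every image: that the curly line is thinner than the digit strokes, so an opening with a small kernel removes the line but keeps the digits. The function name `adaptive_cleanup` and the defaults for `block_size`, `offset`, and `kernel_size` are illustrative and need tuning.

    ```python
    import cv2
    import numpy as np


    def adaptive_cleanup(gray: np.ndarray, block_size: int = 31,
                         offset: int = 10, kernel_size: int = 3) -> np.ndarray:
        """Adaptive threshold followed by a morphological opening (assumed parameters)."""
        # Each pixel is compared with the Gaussian-weighted mean of its
        # block_size x block_size neighbourhood minus offset. THRESH_BINARY_INV
        # turns dark strokes white so the opening acts on the foreground.
        binary_inv = cv2.adaptiveThreshold(
            gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
            cv2.THRESH_BINARY_INV, block_size, offset)
        # Opening (erosion then dilation) removes foreground structures that are
        # thinner than the kernel, which is where a thin line would be dropped.
        kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernel_size, kernel_size))
        opened = cv2.morphologyEx(binary_inv, cv2.MORPH_OPEN, kernel)
        # Invert back to dark text on a white background for OCR.
        return cv2.bitwise_not(opened)
    ```

    If the line is as thick as the digit strokes, the opening will damage the digits too, and a different cue (for example color or connected-component shape) would be needed.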

    (I have the code available in case you need it)