Search code examples
pythonopencvocrpython-tesseract

Why can't tesseract extract text that has a black background?


I have attached a very simple text image that I want text from. It is white with a black background. To the naked eye it seems absolutely legible but apparently to tesseract it is a rubbish. I have tried changing the oem and psm parameters but nothing seems to work. Please note that this works for other images but for this one.

It should be extracted as Chand

Please try running it on your machine and see if it works. Or else I might have to change my ocr engine altogether.

Note: It was working earlier until I tried to add black pixels around the image to help the extraction process. Also I don't think that tesseract was trained on black text on a white background. It should be able to do this too. Also if this was true why does it work for other text images that have the same format as this one

Edit: Miraculously I tried running the script again and this time it was able to extract Chand properly but failed in the below mentioned case. Also please look at the parameters I have used. I have read the documentation and I feel this would be the right choice. I have added the image for your reference. It is not about just this image. Why is tesseract failing for such simple use cases?

enter image description here

enter image description here


Solution

  • To find the desired result, you need to know the followings:

    The input images are boldly written, we need to shrink the bold font and then assume the output as a single uniform block of text.

    To shrink the images we could use erosion

    Result will be:

    Erode Result
    enter image description here CHAND
    enter image description here BAKLIWAL

    Code:


    # Load the library
    import cv2
    import pytesseract
    
    # Initialize the list
    img_lst = ["lKpdZ.png", "ZbDao.png"]
    
    # For each image name in the list
    for name in img_lst:
    
        # Load the image
        img = cv2.imread(name)
    
        # Convert to gry-scale
        gry = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
        # Erode the image
        erd = cv2.erode(gry, None, iterations=2)
    
        # OCR with assuming the image as a single uniform block of text
        txt = pytesseract.image_to_string(erd, config="--psm 6")
        print(txt)