python opencv computer-vision ocr image-preprocessing

Bounding boxes on handwritten digits with OpenCV


I tried the code below to segment each digit in this image, draw a contour around it, and crop it out, but the results are bad. I'm not sure what I need to change or work on.

The best idea I can think of right now is filtering the 4 largest contours in the image except the image contour itself.

The code I'm working with:

import sys
import numpy as np
import cv2

im = cv2.imread('marks/mark28.png')
im3 = im.copy()

gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (5, 5), 0)
thresh = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY_INV, 11, 2)

#################      Now finding Contours         ###################

contours, hierarchy = cv2.findContours(thresh, cv2.RETR_LIST, cv2.CHAIN_APPROX_SIMPLE)

samples = np.empty((0, 100))
responses = []
keys = list(range(48, 58))  # key codes for the characters '0' to '9'

for cnt in contours:
    if cv2.contourArea(cnt) > 50:
        [x, y, w, h] = cv2.boundingRect(cnt)
    
        if h > 28:
            cv2.rectangle(im, (x, y), (x + w, y + h), (0, 0, 255), 2)
            roi = thresh[y:y + h, x:x + w]
            roismall = cv2.resize(roi, (10, 10))
            cv2.imshow('norm', im)
            key = cv2.waitKey(0)

            if key == 27:  # (escape to quit)
                sys.exit()
            elif key in keys:
                responses.append(int(chr(key)))
                sample = roismall.reshape((1, 100))
                samples = np.append(samples, sample, 0)

# Convert the collected labels once, after the loop has finished:
responses = np.array(responses, np.float32)
responses = responses.reshape((responses.size, 1))
print("training complete")

np.savetxt('generalsamples.data', samples)
np.savetxt('generalresponses.data', responses)

I probably need to change the if condition on the height, but more importantly I need conditions that keep only the 4 largest contours in the image. Sadly, I haven't managed to work out what I'm supposed to be filtering on.

This is the kind of result I get. I'm trying to avoid getting those inner contours on the digit "zero".

Unprocessed images as requested: example 1 example 2

All I need is an idea on what I should filter for, don't write code please. Thank you community.


Solution

  • You almost have it. You get multiple bounding rectangles on each digit because you are retrieving every contour, external and internal. You are calling cv2.findContours in RETR_LIST mode, which retrieves all the contours but doesn't create any parent-child relationship. The parent-child relationship is what discriminates between inner (child) and outer (parent) contours; OpenCV calls this "Contour Hierarchy". Check out the docs for an overview of all hierarchy modes. Of particular interest is RETR_EXTERNAL mode, which fetches only external contours, so you don't get multiple contours and (by extension) multiple bounding boxes for each digit!

    Also, it seems that your images have a red border. This will introduce noise while thresholding the image, and this border might be recognized as the top-level outer contour - thus, every other contour (the children of this parent contour) will not be fetched in RETR_EXTERNAL mode. Fortunately, the border position seems constant and we can eliminate it with a simple flood-fill, which pretty much fills a blob of a target color with a substitute color.

    Let's check out the reworked code:

    # Imports:
    import cv2
    import numpy as np
    
    # Set image path
    path = "D://opencvImages//"
    fileName = "rhWM3.png"
    
    # Read Input image
    inputImage = cv2.imread(path+fileName)
    
    # Deep copy for results:
    inputImageCopy = inputImage.copy()
    
    # Convert BGR to grayscale:
    grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
    
    # Threshold via Otsu:
    threshValue, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)
    

    The first step is to get the binary image with all the target blobs/contours. This is the result so far:

    Notice the border is white. We have to delete it; a simple flood-fill seeded at position (x=0, y=0) with black color will suffice:

    # Flood-fill border, seed at (0,0) and use black (0) color:
    cv2.floodFill(binaryImage, None, (0, 0), 0)
    

    This is the filled image, no more border!

    Now we can retrieve the external, outermost contours in RETR_EXTERNAL mode:

    # Get each bounding box
    # Find the big contours/blobs on the filtered image:
    contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    

    Notice you also get each contour's hierarchy as the second return value. This is useful if you want to check whether the current contour is a parent or a child (in RETR_EXTERNAL mode every retrieved contour is top-level, so this matters mainly for the other modes). Alright, let's loop through the contours and get their bounding boxes. If you want to ignore contours below a minimum area threshold, you can also implement an area filter:

    # Look for the outer bounding boxes (no children):
    for c in contours:
    
        # Get the bounding rectangle of the current contour (x, y, width, height):
        rectX, rectY, rectWidth, rectHeight = cv2.boundingRect(c)
    
        # Estimate the bounding rect area:
        rectArea = rectWidth * rectHeight
    
        # Set a min area threshold
        minArea = 10
    
        # Filter blobs by area:
        if rectArea > minArea:
    
            # Draw bounding box:
            color = (0, 255, 0)
            cv2.rectangle(inputImageCopy, (int(rectX), int(rectY)),
                          (int(rectX + rectWidth), int(rectY + rectHeight)), color, 2)
            cv2.imshow("Bounding Boxes", inputImageCopy)
    
            # Crop bounding box:
            currentCrop = inputImage[rectY:rectY+rectHeight,rectX:rectX+rectWidth]
            cv2.imshow("Current Crop", currentCrop)
            cv2.waitKey(0)
    

    The last three lines of the above snippet crop and show the current digit. The detected bounding boxes for both of your images are drawn in green; the red border is part of the input images.
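    As for the idea from the question of keeping only the 4 largest contours: if small noise blobs survive the area filter, one possible sketch is to rank the bounding boxes by area and then order the survivors left to right. The helper name largest_boxes, the default of four digits, and the sample boxes below are all assumptions for illustration; the tuples use the (x, y, width, height) layout that cv2.boundingRect returns:

```python
def largest_boxes(boxes, n=4):
    """Hypothetical helper: keep the n largest boxes, ordered left to right.

    Each box is an (x, y, width, height) tuple, as returned by cv2.boundingRect.
    """
    # Rank by bounding-box area, largest first, and keep the top n.
    biggest = sorted(boxes, key=lambda b: b[2] * b[3], reverse=True)[:n]
    # Reorder by the x coordinate so the crops follow the reading order.
    return sorted(biggest, key=lambda b: b[0])


# Made-up boxes: four digit-sized rectangles plus two small noise blobs.
boxes = [
    (120, 10, 30, 40),
    (5, 5, 3, 3),
    (40, 12, 28, 42),
    (200, 8, 31, 39),
    (80, 11, 29, 41),
    (60, 70, 2, 4),
]

digits = largest_boxes(boxes, n=4)
print(digits)
```

    This only makes sense when the number of digits per image is known in advance; otherwise a plain minimum-area threshold like the one above is the safer filter.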