Tags: python, opencv, image-processing, computer-vision, flood-fill

Flood fill function not producing good results


I applied the floodFill function in OpenCV to extract the foreground from the background, but some of the objects in the image were not recognized by the algorithm. How can I improve my detections, and what modifications are necessary?

import argparse
import cv2
import numpy as np
from skimage.filters import threshold_local

# Parse the path to the input image:
ap = argparse.ArgumentParser()
ap.add_argument("--image", required=True, help="path to the input image")
args = vars(ap.parse_args())

image = cv2.imread(args["image"])
image = cv2.resize(image, (800, 800))
h,w,chn = image.shape
ratio = image.shape[0] / 800.0
orig = image.copy()

gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5, 5), 0)
edged = cv2.Canny(gray, 75, 200)
# show the original image and the edge detected image
print("STEP 1: Edge Detection")
cv2.imshow("Image", image)
cv2.imshow("Edged", edged)

warped1 = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
T = threshold_local(warped1, 11, offset = 10, method = "gaussian")
warped1 = (warped1 > T).astype("uint8") * 255
print("STEP 3: Apply perspective transform")

seed = (10, 10)

foreground, birdEye = floodFillCustom(image, seed)
cv2.circle(birdEye, seed, 50, (0, 255, 0), -1)
cv2.imshow("originalImg", birdEye)

cv2.circle(birdEye, seed, 100, (0, 255, 0), -1)

cv2.imshow("foreground", foreground)
cv2.imshow("birdEye", birdEye)

gray = cv2.cvtColor(foreground, cv2.COLOR_BGR2GRAY)
cv2.imshow("gray", gray)
cv2.imwrite("gray.jpg", gray)

threshImg = cv2.threshold(gray, 1, 255, cv2.THRESH_BINARY)[1]
h_threshold,w_threshold = threshImg.shape
area = h_threshold*w_threshold

cv2.imshow("threshImg", threshImg)
cv2.waitKey(0)

The floodFillCustom function is as follows:

def floodFillCustom(originalImage, seed):

    # Lift every channel to at least 10 so the black fill value (0, 0, 0)
    # cannot be confused with pixels that were already black:
    originalImage = np.maximum(originalImage, 10)
    foreground = originalImage.copy()

    # Flood-fill the background starting at the seed, painting it black;
    # neighbouring pixels are filled while they differ by at most 10 per channel:
    cv2.floodFill(foreground, None, seed, (0, 0, 0),
                  loDiff=(10, 10, 10), upDiff=(10, 10, 10))

    return [foreground, originalImage]

Result image: https://i.sstatic.net/69UUh.jpg


Solution

  • A little bit late, but here's an alternative solution for segmenting the tools. It involves converting the image to the CMYK color space and extracting the K (Key) component, which can then be thresholded to get a nice binary mask of the tools. The procedure is very straightforward:

    1. Convert the image to the CMYK color space
    2. Extract the K (Key) component
    3. Threshold the image via Otsu's thresholding
    4. Apply some morphology (a closing) to clean up the mask
    5. (Optional) Get bounding rectangles of all the tools

    Let's see the code:

    # Imports
    import cv2
    import numpy as np
    
    # Read image
    imagePath = "C://opencvImages//"
    inputImage = cv2.imread(imagePath+"DAxhk.jpg")
    
    # Create deep copy for results:
    inputImageCopy = inputImage.copy()
    
    # Convert to float and divide by 255:
    imgFloat = inputImage.astype(np.float64) / 255.
    
    # Calculate channel K:
    kChannel = 1 - np.max(imgFloat, axis=2)
    
    # Convert back to uint 8:
    kChannel = (255*kChannel).astype(np.uint8)
    

    The first step is to convert the BGR image to CMYK. There's no direct conversion in OpenCV for this, so I applied the conversion formula directly. We can get every color space component from that formula, but we are only interested in the K channel. The conversion is easy, but we need to be careful with the data types: we need to operate on float arrays. After getting the K channel, we convert the image back to an unsigned 8-bit array. This is the resulting image:
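
    In case the other channels are ever needed, here's a minimal sketch of the full CMYK decomposition using the same formula. It is only for illustration (the variable names below, such as cChannel, are my own and not part of the original pipeline); the segmentation itself only uses kChannel:

    # (Illustrative) full CMYK decomposition; only K is needed for the mask:
    kFloat = 1 - np.max(imgFloat, axis=2)

    # Guard against division by zero on pure black pixels (where 1 - K == 0):
    denom = np.clip(1.0 - kFloat, 1e-6, None)

    # OpenCV stores the channels in B, G, R order:
    bFloat, gFloat, rFloat = cv2.split(imgFloat)
    cFloat = (1 - rFloat - kFloat) / denom
    mFloat = (1 - gFloat - kFloat) / denom
    yFloat = (1 - bFloat - kFloat) / denom

    # Scale back to uint8 if you want to inspect them:
    cChannel = (255 * cFloat).astype(np.uint8)
    mChannel = (255 * mFloat).astype(np.uint8)
    yChannel = (255 * yFloat).astype(np.uint8)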

    Let's threshold this image using Otsu's thresholding method:

    # Threshold via Otsu:
    _, binaryImage = cv2.threshold(kChannel, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    

    This yields the following binary image:

    Looks very nice! Additionally, we can clean it up a little bit (joining the little gaps) using a morphological closing. Let's apply a rectangular structuring element of size 5 x 5 and use 2 iterations:

    # Use a little bit of morphology to clean the mask:
    # Set kernel (structuring element) size:
    kernelSize = 5
    # Set morph operation iterations:
    opIterations = 2
    # Get the structuring element:
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
    # Perform closing:
    binaryImage = cv2.morphologyEx(binaryImage, cv2.MORPH_CLOSE, morphKernel, iterations=opIterations, borderType=cv2.BORDER_REFLECT101)
    

    Which results in this:

    Very cool. What follows is optional. We can get the bounding rectangles for every tool by looking for the outer (external) contours:

    # Find the contours on the binary image:
    contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    # Look for the outer bounding boxes (no children):
    for c in contours:
    
        # Get the contours bounding rectangle:
        boundRect = cv2.boundingRect(c)
    
        # Get the dimensions of the bounding rectangle:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]
    
        # Set bounding rectangle:
        color = (0, 0, 255)
        cv2.rectangle( inputImageCopy, (int(rectX), int(rectY)),
                       (int(rectX + rectWidth), int(rectY + rectHeight)), color, 5 )
    
        cv2.imshow("Bounding Rectangles", inputImageCopy)
        cv2.waitKey(0)
    

    Which produces the final image:
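
    As a follow-up, the same bounding rectangles can be used to crop every detected tool into its own image, in case you want to process the tools individually. This is just a minimal sketch on top of the code above (the crops list and the output file names are my own, hypothetical additions):

    # (Illustrative) crop each detected tool using its bounding rectangle:
    crops = []
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        # NumPy slicing takes rows (y) first, then columns (x):
        crops.append(inputImage[y:y + h, x:x + w])

    # Write every crop to disk for inspection:
    for i, crop in enumerate(crops):
        cv2.imwrite("tool_" + str(i) + ".png", crop)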