
How to infer the state of a shape from colors


  • I have Lego bricks forming a 4x4 shape, and I'm trying to infer the status of each zone inside the image: empty/full, and if full, whether the color is yellow or blue.
  • To simplify my work I have added red markers to define the borders of the shape, since the camera shakes sometimes.
  • Here is a clear image of the shape I'm trying to detect, taken with my phone camera.

(EDIT: Note that this image is not my input image; it is used just to demonstrate the required shape clearly.)

shape

  • The shape from the side camera that I'm supposed to use looks like this:

(EDIT : Now this is my input image)

side

  • To focus my work on the working zone, I have created a mask:

mask

  • What I have tried so far is to locate the red markers by color (a simple threshold, without the HSV color space), as follows:
import numpy as np
import matplotlib.pyplot as plt
import cv2

# Read the image and convert BGR -> RGB:
img = cv2.imread('sample.png')
RGB = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Apply the working-zone mask:
mask = cv2.imread('mask.png')
masked = np.minimum(RGB, mask)

# Zero out pixels with significant green or blue, keep only the red channel:
masked[masked[..., 1] > 25] = 0
masked[masked[..., 2] > 25] = 0
masked = masked[..., 0]

# Remove small speckles:
masked = cv2.medianBlur(masked, 5)

plt.imshow(masked, cmap='gray')
plt.show()

And I have spotted the markers so far:

marker

But I'm still confused:

How do I precisely detect the external borders of the desired zone, and the internal borders (the borders of each yellow/blue/green Lego brick) inside the red markers?

Thanks in advance for your kind advice.


Solution

  • I tested this approach using your undistorted image. Suppose you have the rectified camera image, so you see the Lego bricks from a "bird's eye" perspective. The idea is to use the red markers to estimate a center rectangle and crop that portion of the image. Then, since you know each brick's dimensions (and they are constant), you can trace a grid and extract each cell of the grid. You can compute some HSV-based masks to estimate the dominant color in each cell, and that way you know if the space is occupied by a yellow or blue brick, or if it is empty.

    These are the steps:

    1. Get an HSV mask of the red markers
    2. Use each marker to estimate the center rectangle through each marker's coordinates
    3. Crop the center rectangle
    4. Divide the rectangle into cells - this is the grid
    5. Run a series of HSV-based masks on each cell and compute the dominant color
    6. Label each cell with the dominant color

    Let's see the code:

    # Importing cv2 and numpy:
    import numpy as np
    import cv2
    
    # image path
    path = "D://opencvImages//"
    fileName = "Bg9iB.jpg"
    
    # Reading an image in default mode:
    inputImage = cv2.imread(path + fileName)
    # Store a deep copy for results:
    inputCopy = inputImage.copy()
    
    # Convert the image to HSV:
    hsvImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2HSV)
    
    # The HSV mask values (Red):
    lowerValues = np.array([127, 0, 95])
    upperValues = np.array([179, 255, 255])
    
    # Create the HSV mask
    mask = cv2.inRange(hsvImage, lowerValues, upperValues)
    

    The first part is very straightforward. You set the HSV range and use cv2.inRange to get a binary mask of the target color. This is the result:

    We can further improve the binary mask using some morphology. Let's apply a closing with a somewhat big structuring element and 10 iterations. We want those markers as clearly defined as possible:

    # Set kernel (structuring element) size:
    kernelSize = 5
    # Set operation iterations:
    opIterations = 10
    # Get the structuring element:
    maxKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (kernelSize, kernelSize))
    # Perform closing:
    mask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, maxKernel, None, None, opIterations, cv2.BORDER_REFLECT101)
    

    Which yields:

    Very nice. Now, let's detect contours on this mask. We will approximate each contour to a bounding box and store its starting point and dimensions. The idea is that, while we will detect every contour, we are not sure of their order. We can sort this list later and enumerate each bounding box from left to right, top to bottom, to better estimate the central rectangle. Let's detect contours:

    # Create a deep copy, convert it to BGR for results:
    maskCopy = mask.copy()
    maskCopy = cv2.cvtColor(maskCopy, cv2.COLOR_GRAY2BGR)
    
    # Find the big contours/blobs on the filtered image:
    contours, hierarchy = cv2.findContours(mask, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    
    # Bounding Rects are stored here:
    boundRectsList = []
    
    # Process each contour 1-1:
    for i, c in enumerate(contours):
    
        # Approximate the contour to a polygon:
        contoursPoly = cv2.approxPolyDP(c, 3, True)
    
        # Convert the polygon to a bounding rectangle:
        boundRect = cv2.boundingRect(contoursPoly)
    
        # Get the bounding rect's data:
        rectX = boundRect[0]
        rectY = boundRect[1]
        rectWidth = boundRect[2]
        rectHeight = boundRect[3]
    
        # Estimate the bounding rect area:
        rectArea = rectWidth * rectHeight
    
        # Set a min area threshold
        minArea = 100
    
        # Filter blobs by area:
        if rectArea > minArea:
            #Store the rect:
            boundRectsList.append(boundRect)
    

    I also created a deep copy of the mask image for further use. Mainly to create this image, which is the result of the contour detection and bounding box approximation:

    Notice that I have included a minimum area condition. I want to ignore noise below a certain threshold defined by minArea. Alright, now we have the bounding boxes in the boundRectsList variable. Let's sort these boxes using the Y coordinate:

    # Sort the list based on ascending y values:
    boundRectsSorted = sorted(boundRectsList, key=lambda x: x[1])
    

    The list is now sorted and we can enumerate the boxes from left to right, top to bottom, like this: first "row" -> 0, 1; second "row" -> 2, 3. Now we can define the big central rectangle using this info. I call these the "inner points". Notice the rectangle is defined as a function of all the bounding boxes. For example, its top-left starting point is bounding box 0's bottom-right ending point (both x and y), its width is defined by bounding box 1's bottom-left x coordinate, and its height is defined by bounding box 2's top y coordinate. I'm going to loop through each bounding box and extract the relevant dimensions to construct the center rectangle as (top left x, top left y, width, height). There's more than one way to achieve this; I prefer to use a dictionary to get the relevant data. Let's see:
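    One caveat: sorting by the y value alone works here because the two markers in each row sit at roughly the same height. If the markers within a row could land at noticeably different heights, a row-major sort (bucket by y, then sort each bucket by x) is safer. A minimal sketch, with made-up rectangle values and an assumed rowTolerance:

```python
# Hypothetical rects as (x, y, width, height); the small y jitter
# mimics markers that are not perfectly aligned within a row:
rects = [(500, 98, 40, 40), (20, 102, 40, 40),   # top row
         (510, 400, 40, 40), (25, 395, 40, 40)]  # bottom row

# rowTolerance is an assumed value: boxes whose y values differ by
# at most this many pixels are treated as the same row.
rowTolerance = 50

# Sort top-to-bottom, bucket into rows, then sort each row left-to-right:
rects.sort(key=lambda r: r[1])
rows, currentRow = [], [rects[0]]
for r in rects[1:]:
    if abs(r[1] - currentRow[0][1]) <= rowTolerance:
        currentRow.append(r)
    else:
        rows.append(sorted(currentRow, key=lambda r: r[0]))
        currentRow = [r]
rows.append(sorted(currentRow, key=lambda r: r[0]))

# Flatten back into a single left-to-right, top-to-bottom list:
orderedRects = [r for row in rows for r in row]
```

    With well-aligned markers both approaches give the same order, so this is only insurance against a shaky camera.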

    # Rectangle dictionary:
    # Each entry is an index of the currentRect list
    # 0 - X, 1 - Y, 2 - Width, 3 - Height
    # Additionally: -1 is 0 (no dimension):
    pointsDictionary = {0: (2, 3),
                        1: (-1, 3),
                        2: (2, -1),
                        3: (-1, -1)}
    
    # Store center rectangle coordinates here:
    centerRectangle = [None]*4
    
    # Process the sorted rects:
    rectCounter = 0
    
    for i in range(len(boundRectsSorted)):
    
        # Get sorted rect:
        currentRect = boundRectsSorted[i]
    
        # Get the bounding rect's data:
        rectX = currentRect[0]
        rectY = currentRect[1]
        rectWidth = currentRect[2]
        rectHeight = currentRect[3]
    
        # Draw sorted rect:
        cv2.rectangle(maskCopy, (int(rectX), int(rectY)), (int(rectX + rectWidth),
                                 int(rectY + rectHeight)), (0, 255, 0), 5)
    
        # Get the inner points:
        currentInnerPoint = pointsDictionary[i]
        borderPoint = [None]*2
    
        # Check coordinates:
        for p in range(2):
            # Check for '0' index:
            idx = currentInnerPoint[p]
            if idx == -1:
                borderPoint[p] = 0
            else:
                borderPoint[p] = currentRect[idx]
    
        # Draw the border points:
        color = (0, 0, 255)
        thickness = -1
        centerX = rectX + borderPoint[0]
        centerY = rectY + borderPoint[1]
        radius = 50
        cv2.circle(maskCopy, (centerX, centerY), radius, color, thickness)
    
        # Mark the circle
        org = (centerX - 20, centerY + 20)
        font = cv2.FONT_HERSHEY_SIMPLEX
        cv2.putText(maskCopy, str(rectCounter), org, font,
                2, (0, 0, 0), 5, cv2.LINE_8)
    
        # Show the circle:
        cv2.imshow("Sorted Rects", maskCopy)
        cv2.waitKey(0)
    
        # Store the coordinates into the list:
        if rectCounter == 0:
            centerRectangle[0] = centerX
            centerRectangle[1] = centerY
        elif rectCounter == 1:
            centerRectangle[2] = centerX - centerRectangle[0]
        elif rectCounter == 2:
            centerRectangle[3] = centerY - centerRectangle[1]

        # Increase rectCounter:
        rectCounter += 1
    

    This image shows each inner point with a red circle. Each circle is enumerated from left to right, top to bottom. The inner points are stored in the centerRectangle list:

    If you join each inner point you get the center rectangle we have been looking for:

    # Check out the big rectangle at the center:
    bigRectX = centerRectangle[0]
    bigRectY = centerRectangle[1]
    bigRectWidth = centerRectangle[2]
    bigRectHeight = centerRectangle[3]
    # Draw the big rectangle:
    cv2.rectangle(maskCopy, (int(bigRectX), int(bigRectY)), (int(bigRectX + bigRectWidth),
                         int(bigRectY + bigRectHeight)), (0, 0, 255), 5)
    cv2.imshow("Big Rectangle", maskCopy)
    cv2.waitKey(0)
    

    Check it out:

    Now, just crop this portion of the original image:

    # Crop the center portion:
    centerPortion = inputCopy[bigRectY:bigRectY + bigRectHeight, bigRectX:bigRectX + bigRectWidth]
    
    # Store a deep copy for results:
    centerPortionCopy = centerPortion.copy()
    

    This is the central portion of the image:

    Cool, now let's create the grid. You know that there must be 4 bricks per width and 4 bricks per height. We can divide the image using this info. I'm storing each sub-image, or cell, in a list. I'm also estimating each cell's center, for additional processing. These are stored in a list too. Let's see the procedure:

    # Divide the image into a grid:
    verticalCells = 4
    horizontalCells = 4
    
    # Cell dimensions
    cellWidth = bigRectWidth / horizontalCells
    cellHeight = bigRectHeight / verticalCells
    
    # Store the cells here:
    cellList = []
    
    # Store cell centers here:
    cellCenters = []
    
    # Loop thru vertical dimension:
    for j in range(verticalCells):
    
        # Cell starting y position:
        yo = j * cellHeight
    
        # Loop thru horizontal dimension:
        for i in range(horizontalCells):
    
            # Cell starting x position:
            xo = i * cellWidth
    
            # Cell Dimensions:
            cX = int(xo)
            cY = int(yo)
            cWidth = int(cellWidth)
            cHeight = int(cellHeight)
    
            # Crop current cell:
            currentCell = centerPortion[cY:cY + cHeight, cX:cX + cWidth]
    
            # into the cell list:
            cellList.append(currentCell)
    
            # Store cell center:
            cellCenters.append((cX + 0.5 * cWidth, cY + 0.5 * cHeight))
    
            # Draw Cell
            cv2.rectangle(centerPortionCopy, (cX, cY), (cX + cWidth, cY + cHeight), (255, 255, 0), 5)
    
        cv2.imshow("Grid", centerPortionCopy)
        cv2.waitKey(0)
    

    This is the grid:

    Let's now process each cell individually. Of course, you could process each cell in the last loop, but I'm not currently looking for optimization; clarity is my priority. We need to generate a series of HSV masks with the target colors: yellow, blue and green (empty). Again, I prefer to implement a dictionary with the target colors. I'll generate a mask for each color and count the number of white pixels using cv2.countNonZero. I also set a minimum threshold, this time of 10 pixels. With this info I can determine which mask generated the maximum number of white pixels, giving me the dominant color:

    # HSV dictionary - color ranges and color name:
    colorDictionary = {0: ([93, 64, 21], [121, 255, 255], "blue"),
                       1: ([20, 64, 21], [30, 255, 255], "yellow"),
                       2: ([55, 64, 21], [92, 255, 255], "green")}
    
    # Cell counter:
    cellCounter = 0
    
    for c in range(len(cellList)):
    
        # Get current Cell:
        currentCell = cellList[c]
        # Convert to HSV:
        hsvCell = cv2.cvtColor(currentCell, cv2.COLOR_BGR2HSV)
    
        # Some additional info:
        (h, w) = currentCell.shape[:2]
    
        # Process masks:
        maxCount = 10
        cellColor = "None"
    
        for m in range(len(colorDictionary)):
    
            # Get current lower and upper range values:
            currentLowRange = np.array(colorDictionary[m][0])
            currentUppRange = np.array(colorDictionary[m][1])
    
            # Create the HSV mask
            mask = cv2.inRange(hsvCell, currentLowRange, currentUppRange)
    
            # Get max number of target pixels
            targetPixelCount = cv2.countNonZero(mask)
            if targetPixelCount > maxCount:
                maxCount = targetPixelCount
                # Get color name from dictionary:
                cellColor = colorDictionary[m][2]
    
        # Get cell center, add an x offset:
        textX = int(cellCenters[cellCounter][0]) - 100
        textY = int(cellCenters[cellCounter][1])
    
        # Draw text on cell's center:
        font = cv2.FONT_HERSHEY_SIMPLEX
        cv2.putText(centerPortion, cellColor, (textX, textY), font,
                        2, (0, 0, 255), 5, cv2.LINE_8)
    
        # Increase cellCounter:
        cellCounter += 1
    
        cv2.imshow("centerPortion", centerPortion)
        cv2.waitKey(0)
    

    This is the result:

    From here it is easy to identify the empty spaces on the grid. What I didn't cover was the perspective rectification of your distorted image, but there's plenty of info on how to do that. Hope this helps you out!

    Edit:

    If you want to apply this approach to your distorted image you need to undo the fish-eye and the perspective distortion. Your rectified image should look like this:

    You probably will have to tweak some values because some of the distortion still remains, even after rectification.