Tags: python, python-3.x, opencv, image-processing, kalman-filter

OpenCV & Python - How does one filter noise from an irregular shaped polygon detected by OpenCV using a Kalman filter?


I have a small tracking project that I am working on. My frame-by-frame detection scheme is set up and working, but I get a fair amount of noise in the polygon I extract even when the scene is static. Since I want this to run in real time, Kalman filtering seems like the best way to solve the problem; however, implementation details are sparse. I have seen some examples via Google, but they typically deal with bounding boxes or regular shapes, which are described with only a few parameters, and I am not sure that approach would work here.

I am interested in tracking the evolution of the more irregular geometry shown below. It takes ~100 points or more to describe the polygon. How can I adapt the OpenCV Kalman tools to handle this task?

Thanks in advance.

**Update**

Some additional details: I need an accurate profile of the object for downstream analysis, so a bounding box is not an option. My camera can produce frames at 30 fps, but I do not need to process that fast, though I do not want to process only one frame per second either. Doing a fast de-noising operation is too slow. My images are 4024x3036 monochrome. I attached JPEG versions of six shots of my scene; the sample is the small chunk in the center of the two plates in the bottom third of the image. I also attached what I am looking to pull from each frame: an irregular polygon that accurately matches the 2D profile of the shape. I will favor accuracy and stability over speed, but I would like to process a few frames per second.

I will go capture some representative images or small movie and will post shortly.

Thanks in advance.

Sample Images

Shot 1

Shot 2

Shot 3

Shot 4

Shot 5

Shot 6

The goal

Example of what I am looking for


Solution

  • The Concept

    Notice that, among the columns of the images, the columns where the purple lines should go contain the most black. We can detect the ROI (region of interest) by first detecting the first and last columns with at least a certain amount of black, and then detecting, at each of those two columns, the rows where the white region first starts and first ends.

    The Code

    import cv2
    import numpy as np
    
    files = [f"img{i}.jpg" for i in range(1, 6)]
    
    for file in files:
        img = cv2.imread(file)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
        # The columns spanning the gap between the plates contain the most black,
        # so their sums are closest to the minimum column sum
        sum_cols = thresh.sum(0)
        indices = np.where(sum_cols < sum_cols.min() + 40000)[0]
        x1, x2 = indices[0] - 50, indices[-1] + 50  # ROI x bounds, 50 px padding
        # The first two black/white transitions in each boundary column
        # give the y coordinates of the top and bottom lines
        diff1, diff2 = np.diff(thresh[:, [x1, x2]].T, 1)
        y1_1, y2_1 = np.where(diff1)[0][:2]
        y1_2, y2_2 = np.where(diff2)[0][:2]
        y1, y2 = min(y1_1, y1_2), max(y2_1, y2_2)
        # Detect and draw contours inside the ROI only
        img_canny = cv2.Canny(thresh[y1: y2, x1: x2], 50, 50)
        contours, _ = cv2.findContours(img_canny, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        cv2.line(img, (x1, y1_1), (x2, y1_2), (255, 0, 160), 5)
        cv2.line(img, (x1, y2_1), (x2, y2_2), (255, 0, 160), 5)
        cv2.drawContours(img[y1: y2, x1: x2], contours, -1, (0, 0, 255), 10)
        cv2.imshow("Image", img)
        # Wait for a key press on each image; press q to quit early
        if cv2.waitKey(0) & 0xFF == ord('q'):
            break
    
    

    The Output

    Here is what the program outputs for each of the images you provided:

    (annotated output images)

    The Explanation

    1. Import the necessary libraries:
    import cv2
    import numpy as np
    
    2. Since you have a camera, you'll presumably use the cv2.VideoCapture() method. As I only have the images you provided, I'll make the program read in each image instead. So, store every image filename in a list (I have img1.jpg, img2.jpg, ... img5.jpg), then iterate through the names and read in each image:
    files = [f"img{i}.jpg" for i in range(1, 6)]
    
    for file in files:
        img = cv2.imread(file)
    
    3. Convert each image to grayscale, and use the cv2.threshold() method to reduce the grayscale image to only 2 values: 0 for each pixel that's less than or equal to 127, and 255 for each pixel that's greater than 127:
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        _, thresh = cv2.threshold(gray, 127, 255, cv2.THRESH_BINARY)
    
    4. To find the columns with the most 0s (the most black), we'll need the sum of every column; the smallest sum belongs to the column with the most 0s. With the sum of each column, we can use the np.where() method to find the index of every column whose sum is close to the smallest sum detected. The first and the last of those indices become the x1 and x2 of our ROI (along with a padding of 50 pixels):
        sum_cols = thresh.sum(0)
        indices = np.where(sum_cols < sum_cols.min() + 40000)[0]
        x1, x2 = indices[0] - 50, indices[-1] + 50
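As a toy illustration of this column-sum trick (the 4x6 array below is made up, not taken from the real images; the tolerance of 300 stands in for the 40000 used at full resolution):

```python
import numpy as np

# Toy binary image: columns 2 and 3 are mostly black,
# so their sums sit closest to the minimum column sum.
thresh = np.array([
    [255, 255,   0,   0, 255, 255],
    [255, 255,   0,   0, 255, 255],
    [255, 255,   0, 255, 255, 255],
    [255, 255,   0,   0, 255, 255],
], dtype=np.uint8)

sum_cols = thresh.sum(0)  # [1020, 1020, 0, 255, 1020, 1020]
indices = np.where(sum_cols < sum_cols.min() + 300)[0]
x1, x2 = indices[0], indices[-1]  # 2 and 3: the darkest span of columns
```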
    
    5. To find the y coordinates of the two lines, detect the first two black/white transitions in each of the two boundary columns: the first transition in each column gives an endpoint of the top line, and the second gives an endpoint of the bottom line. With those 4 y coordinates, the y1 of our ROI is the smaller of the two top-line values and the y2 is the larger of the two bottom-line values:
        diff1, diff2 = np.diff(thresh[:, [x1, x2]].T, 1)
        y1_1, y2_1 = np.where(diff1)[0][:2]
        y1_2, y2_2 = np.where(diff2)[0][:2]
        y1, y2 = min(y1_1, y1_2), max(y2_1, y2_2)
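Why does np.where() on the diff catch both kinds of transition? Because the thresholded image is uint8, the subtraction inside np.diff() wraps around: a 0-to-255 edge yields 255 and a 255-to-0 edge yields 1, so both are nonzero. A small demo on a made-up column:

```python
import numpy as np

col = np.array([0, 0, 255, 255, 255, 0, 0], dtype=np.uint8)
d = np.diff(col)
# uint8 arithmetic wraps: 255 - 0 = 255 (rising edge), 0 - 255 = 1 (falling edge)
# d == [0, 255, 0, 0, 1, 0]
edges = np.where(d)[0]  # [1, 4]: the rows just before each transition
```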
    
    6. Now that we have our ROI, we can detect the edges of the objects within it using the Canny edge detector, and find the contours of those edges using the cv2.findContours() method:
        img_canny = cv2.Canny(thresh[y1: y2, x1: x2], 50, 50)
        contours, _ = cv2.findContours(img_canny, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    
    7. Finally, we can draw the lines and the contours onto the original (non-binary) images, and show each result (press any key to advance, or q to quit):
        cv2.line(img, (x1, y1_1), (x2, y1_2), (255, 0, 160), 5)
        cv2.line(img, (x1, y2_1), (x2, y2_2), (255, 0, 160), 5)
        cv2.drawContours(img[y1: y2, x1: x2], contours, -1, (0, 0, 255), 10)
        cv2.imshow("Image", img)
        if cv2.waitKey(0) & 0xFF == ord('q'):
            break