opencv, geometry, computer-vision, kinect, openni

Using Kinect to detect objects on the floor


Let's say that I have a Kinect pointing at a floor.

If I place 3 or 4 objects on the floor, how can I determine the plane those objects are on?

How can I detect brightly colored objects on that floor?


Solution

  • The Kinect returns a depth map: a matrix that holds the distance from the sensor to the surface seen at each pixel. Following the pinhole camera model, each depth measurement can be aligned with its corresponding RGB value. I will assume that you already know how to map each pixel of the depth matrix to its X, Y, Z position in space and to its RGB value. If not, you will need to do further research to understand how the stereo correlation between the depth sensor and the RGB camera is done.
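As a rough illustration of the pixel-to-X,Y,Z step under the pinhole camera model, here is a minimal sketch. The function name `depth_pixel_to_point` and the intrinsic values `FX`, `FY`, `CX`, `CY` are assumptions made up for this example; real intrinsics come from calibrating your own sensor, and the sketch assumes the depth value is the distance along the optical axis, in metres.

```python
def depth_pixel_to_point(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth depth_m into camera-space (X, Y, Z).

    fx, fy are the focal lengths in pixels and (cx, cy) is the principal point.
    depth_m is assumed to be measured along the optical axis (the Z axis).
    """
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Hypothetical intrinsics for a 640x480 depth image (NOT calibrated values).
FX, FY, CX, CY = 580.0, 580.0, 320.0, 240.0

# A pixel to the right of and below the image center, 1.5 m away.
point = depth_pixel_to_point(400, 300, 1.5, FX, FY, CX, CY)
print(point)
```

A pixel at the principal point maps to X = Y = 0, and points further from the image center move further from the optical axis in proportion to their depth.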

    You asked two completely different questions here. The first one is easy to solve with some basic geometry, but you need to solve the second one first in order to find the objects' positions in space.

    There are several approaches to finding the brightly colored objects. If your sensor records a static scene, you can use background subtraction. This produces a binary image marking the pixels whose values differ from a previously trained background model. Since your objects will explicitly have brighter colors than the background, you can also simply apply threshold segmentation: convert the RGB image to HSL and look for high lightness (L) values. There are several other methods; look into them if these two don't solve your problem. Either method returns a binary image with blobs. You can use the centers of those blobs as the matrix coordinates of your brightly colored objects.
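The thresholding route can be sketched in plain Python as follows. This is only an illustration: the helper names `bright_mask` and `blob_centers`, the list-of-tuples image layout, and the default threshold of 0.4 are all assumptions for this example. In practice you would more likely use OpenCV's color conversion and contour or connected-component functions on real frames. Note that a fully saturated color has an HSL lightness of 0.5, so the threshold has to sit between the floor's lightness and that value.

```python
import colorsys
from collections import deque


def bright_mask(rgb_image, min_lightness=0.4):
    """Return a binary mask that is True where HSL lightness >= min_lightness.

    rgb_image is a list of rows, each row a list of (r, g, b) tuples in 0..255.
    """
    mask = []
    for row in rgb_image:
        mask_row = []
        for r, g, b in row:
            # colorsys returns (hue, lightness, saturation), each in 0..1.
            _, lightness, _ = colorsys.rgb_to_hls(r / 255.0, g / 255.0, b / 255.0)
            mask_row.append(lightness >= min_lightness)
        mask.append(mask_row)
    return mask


def blob_centers(mask):
    """Find 4-connected blobs in a binary mask; return each centroid as (row, col)."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    centers = []
    for r0 in range(height):
        for c0 in range(width):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            # Breadth-first flood fill collects every pixel of this blob.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < height and 0 <= nc < width
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            mean_row = sum(p[0] for p in pixels) / len(pixels)
            mean_col = sum(p[1] for p in pixels) / len(pixels)
            centers.append((mean_row, mean_col))
    return centers


# Tiny synthetic scene: dark floor with two bright yellow objects.
FLOOR, OBJECT = (40, 40, 40), (255, 255, 0)
image = [[FLOOR] * 6 for _ in range(5)]
for r, c in ((1, 1), (1, 2), (2, 1), (2, 2), (4, 5)):
    image[r][c] = OBJECT

centers = blob_centers(bright_mask(image))
print(centers)
```

Each returned center is a (row, column) position in the image matrix, which is what you would then look up in the depth map to get the object's position in space.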

    With 3 blob centers A', B' and C' you will be able to find the plane that you are looking for, as represented in the picture below:

    [Figure: Finding the plane]

    Explanation: A plane can be represented by a point (its position) and a normal (its orientation). Assuming that all your objects lie exactly on the plane you are trying to find, all you need is 3 non-collinear points A, B and C, which form a triangle inside that plane. The triangle's normal is equal to (A - B) x (C - B), where x denotes the cross product, and it is the same as the plane's normal. So your plane is defined by any one of those 3 points together with that triangle normal. If the objects' dimensions are significant, you will need to take them into consideration when defining the plane's position.
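The plane construction above can be written out as a short sketch. The helper names (`plane_from_points`, `signed_distance`) and the sample coordinates are assumptions for illustration only. The normal is normalized to unit length here so that the distance helper returns metric distances, which the explanation above does not require.

```python
import math


def subtract(p, q):
    """Component-wise difference p - q of two 3D points."""
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def cross(u, v):
    """Cross product u x v of two 3D vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def plane_from_points(a, b, c):
    """Return (point_on_plane, unit_normal) for the plane through a, b and c.

    The normal is (a - b) x (c - b), scaled to unit length.
    """
    normal = cross(subtract(a, b), subtract(c, b))
    length = math.sqrt(sum(component * component for component in normal))
    if length == 0.0:
        raise ValueError("points are collinear or identical; no unique plane")
    unit_normal = tuple(component / length for component in normal)
    return b, unit_normal


def signed_distance(point, plane_point, unit_normal):
    """Signed distance from point to the plane, positive on the normal's side."""
    offset = subtract(point, plane_point)
    return sum(o * n for o, n in zip(offset, unit_normal))


# Hypothetical 3D positions of three objects (same units as the depth data).
A, B, C = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
plane_point, plane_normal = plane_from_points(A, B, C)
print(plane_point, plane_normal)
```

With 4 or more objects, `signed_distance` gives a quick way to check how far the extra points are from the plane defined by the first three.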