computer-vision, kinect, motion-detection

Kinect as Motion Sensor


I'm planning on creating an app that does something like this: http://www.zonetrigger.com/articles/Kinect-software/

That means I want to be able to set up "Trigger Zones" using the Kinect and its 3D image. Now, I know Microsoft states that the Kinect can detect the skeletons of up to 6 people. For me, however, it would be enough to detect whether something is entering a trigger zone, and where.

Does anyone know if the Kinect can be programmed to function as a simple Motion Sensor, so it can detect more than 6 entries?


Solution

  • It is well known that the Kinect cannot detect more than 5 entries (just kidding). All you need to do is get a depth map (z-map) from the Kinect and then convert it into a 3D map using these formulas:

    X = ((col - cap_width)  * Z) / focal_length_X;
    Y = ((row - cap_height) * Z) / focal_length_Y;
    Z = Z;
    

    Here col and row are pixel coordinates and cap_width and cap_height are half the capture width and height, so the coordinates are effectively measured from the image center (not the upper-left corner!); focal_length is the focal length of the Kinect in pixels (~570). Now you can specify exact regions in 3D and react whenever depth pixels appear inside them (a minimal sketch of this idea follows the numbered list below). Here are more pointers:

    1. You can use OpenCV for ease of visualization. To read a frame from the Kinect after it has been initialized, you just need something like this:

      // wrap the raw 16-bit depth buffer in a cv::Mat header (no data copy)
      Mat inputMat = Mat(h, w, CV_16U, (void*) depth_gen.GetData());

    2. You can easily visualize depth maps using histogram equalization (it will optimally spread the ~10,000 Kinect depth levels over the 256 available levels of grey); see the visualization sketch after this list.

    3. It is sometimes desirable to do object segmentation, grouping spatially close pixels with similar depth together (a rough sketch of this follows below). I did this several years ago, but I had to delete the floor and/or the common surface on which the objects stood; otherwise all the objects were connected and extracted as one large segment.
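
    Putting the formulas above to work, here is a minimal C++/OpenCV sketch of the trigger-zone idea: back-project each depth pixel into 3D and count how many points land inside an axis-aligned box. The frame format (16-bit depth in millimetres), the ~570 px focal length and the Zone type are illustrative assumptions, not part of any Kinect SDK.

      #include <opencv2/core/core.hpp>

      // Axis-aligned 3D box acting as a trigger zone (coordinates in mm).
      struct Zone { cv::Point3f minCorner, maxCorner; };

      static bool pointInZone(const cv::Point3f& p, const Zone& z)
      {
          return p.x >= z.minCorner.x && p.x <= z.maxCorner.x &&
                 p.y >= z.minCorner.y && p.y <= z.maxCorner.y &&
                 p.z >= z.minCorner.z && p.z <= z.maxCorner.z;
      }

      // Count how many depth pixels back-project into the zone.
      int countHitsInZone(const cv::Mat& depth16U, const Zone& zone,
                          float focalLength = 570.f)
      {
          const float cx = depth16U.cols / 2.f;   // "cap_width"  = half the frame width
          const float cy = depth16U.rows / 2.f;   // "cap_height" = half the frame height
          int hits = 0;

          for (int row = 0; row < depth16U.rows; ++row)
              for (int col = 0; col < depth16U.cols; ++col)
              {
                  float Z = depth16U.at<ushort>(row, col);   // depth in mm, 0 = no reading
                  if (Z == 0.f) continue;

                  // Same back-projection as the formulas above.
                  float X = (col - cx) * Z / focalLength;
                  float Y = (row - cy) * Z / focalLength;

                  if (pointInZone(cv::Point3f(X, Y, Z), zone))
                      ++hits;
              }
          return hits;
      }

    Triggering is then just a matter of calling countHitsInZone on every frame and reacting when the count exceeds some noise threshold.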
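
    For pointer 2, a small sketch of one way to do the visualization, assuming depth values in the 0–10,000 mm range: convert to 8 bits and let cv::equalizeHist spread the used levels over the full grey range. The window name is arbitrary.

      #include <opencv2/core/core.hpp>
      #include <opencv2/imgproc/imgproc.hpp>
      #include <opencv2/highgui/highgui.hpp>

      // Display a 16-bit depth frame as a histogram-equalized grey image.
      void showDepth(const cv::Mat& depth16U)
      {
          cv::Mat depth8U, display;
          depth16U.convertTo(depth8U, CV_8U, 255.0 / 10000.0);  // ~0..10000 mm -> 0..255
          cv::equalizeHist(depth8U, display);                   // spread used levels over the grey range
          cv::imshow("depth", display);
          cv::waitKey(1);
      }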
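
    For pointer 3, a rough sketch of one way to group spatially connected pixels with similar depth, using cv::floodFill as a region grower (neighbouring pixels are merged when their depth differs by less than a tolerance). The 30 mm tolerance and the minimum-blob size are arbitrary, and the floor removal mentioned above is not shown.

      #include <vector>
      #include <opencv2/core/core.hpp>
      #include <opencv2/imgproc/imgproc.hpp>

      // Return a bounding box for every depth-connected blob in the frame.
      std::vector<cv::Rect> segmentByDepth(const cv::Mat& depth16U, float toleranceMm = 30.f)
      {
          std::vector<cv::Rect> segments;

          cv::Mat depth;
          depth16U.convertTo(depth, CV_32F);   // floodFill needs an 8-bit or float image

          // floodFill requires the mask to be 2 pixels larger than the image.
          cv::Mat mask = cv::Mat::zeros(depth.rows + 2, depth.cols + 2, CV_8U);

          for (int row = 0; row < depth.rows; ++row)
              for (int col = 0; col < depth.cols; ++col)
              {
                  if (depth.at<float>(row, col) == 0.f) continue;        // no depth reading
                  if (mask.at<uchar>(row + 1, col + 1) != 0) continue;   // already labelled

                  // Grow a region: neighbours join if their depth differs by < toleranceMm.
                  cv::Rect box;
                  cv::floodFill(depth, mask, cv::Point(col, row), cv::Scalar(), &box,
                                cv::Scalar(toleranceMm), cv::Scalar(toleranceMm),
                                4 | cv::FLOODFILL_MASK_ONLY | (255 << 8));

                  if (box.area() > 200)        // ignore tiny blobs (arbitrary threshold)
                      segments.push_back(box);
              }
          return segments;
      }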