kinect openni

getUserPixels - alternative in official Kinect SDK


Is there an alternative for the getUserPixels method offered by OpenNI in the official Kinect SDK?

How would one implement this functionality with the official Kinect SDK?


Solution

  • The official Kinect for Windows SDK (v1.6) does not support a direct call, such as getUserPixels, to extract a player silhouette but does contain all the information necessary to do so.

    You can see this in action, in different ways, by examining two of the examples available from the Kinect for Windows Developer Toolkit.

    • Basic Interactions-WPF: includes a function to create a simple silhouette of the user being tracked.
    • Green Screen (-WPF, or -D2D): shows how to perform background subtraction to produce a green screen effect. In this example the data from the RGB camera is superimposed over an image.

    The two examples do this in different ways.

    • Basic Interactions will pull out a BitmapMask from the depth data which corresponds to the requested player. This has the advantage of only showing tracked users; any object not thought to be a skeleton is ignored.
    • Green Screen does not look for a particular user, instead opting for motion. This gives the advantage of silhouetting any moving object, such as a ball being passed between two users.

    I believe the "Basic Interactions" example will show you how to implement what you are looking for. You'll have to do the work yourself, but it is possible. For example, using the "Basic Interactions" example as a base, I created a UserControl that generates a simple silhouette of the user being tracked.

    When the skeleton frame is ready, I pull out the player index:

    private void OnSkeletonFrameReady(object sender, SkeletonFrameReadyEventArgs e)
    {
        using (SkeletonFrame skeletonFrame = e.OpenSkeletonFrame())
        {
            if (skeletonFrame != null && skeletonFrame.SkeletonArrayLength > 0)
            {
                if (_skeletons == null || _skeletons.Length != skeletonFrame.SkeletonArrayLength)
                {
                    _skeletons = new Skeleton[skeletonFrame.SkeletonArrayLength];
                }
    
                skeletonFrame.CopySkeletonDataTo(_skeletons);
    
                // grab the tracked skeleton and set the playerIndex for use pulling
                // the depth data out for the silhouette.
                // NOTE: this assumes only a single tracked skeleton!
                this.playerIndex = -1;
                for (int i = 0; i < _skeletons.Length; i++)
                {
                    if (_skeletons[i].TrackingState != SkeletonTrackingState.NotTracked)
                    {
                        this.playerIndex = i+1;
                    }
                }
            }
        }
    }
    

    Then, when the next depth frame is ready, I pull out the BitmapMask for the user that corresponds to playerIndex.

    private void OnDepthFrameReady(object sender, DepthImageFrameReadyEventArgs e)
    {
        using (DepthImageFrame depthFrame = e.OpenDepthImageFrame())
        {
            if (depthFrame != null)
            {
                // check if the format has changed.
                bool haveNewFormat = this.lastImageFormat != depthFrame.Format;
    
                if (haveNewFormat)
                {
                    this.pixelData = new short[depthFrame.PixelDataLength];
                    this.depthFrame32 = new byte[depthFrame.Width * depthFrame.Height * Bgra32BytesPerPixel];
                    this.convertedDepthBits = new byte[this.depthFrame32.Length];
                }
    
                depthFrame.CopyPixelDataTo(this.pixelData);
    
                for (int i16 = 0, i32 = 0; i16 < pixelData.Length && i32 < depthFrame32.Length; i16++, i32 += 4)
                {
                    int player = pixelData[i16] & DepthImageFrame.PlayerIndexBitmask;
                    if (player == this.playerIndex)
                    {
                        convertedDepthBits[i32 + RedIndex] = 0x44;
                        convertedDepthBits[i32 + GreenIndex] = 0x23;
                        convertedDepthBits[i32 + BlueIndex] = 0x59;
                        convertedDepthBits[i32 + 3] = 0x66;
                    }
                    else if (player > 0)
                    {
                        convertedDepthBits[i32 + RedIndex] = 0xBC;
                        convertedDepthBits[i32 + GreenIndex] = 0xBE;
                        convertedDepthBits[i32 + BlueIndex] = 0xC0;
                        convertedDepthBits[i32 + 3] = 0x66;
                    }
                    else
                    {
                        convertedDepthBits[i32 + RedIndex] = 0x0;
                        convertedDepthBits[i32 + GreenIndex] = 0x0;
                        convertedDepthBits[i32 + BlueIndex] = 0x0;
                        convertedDepthBits[i32 + 3] = 0x0;
                    }
                }
    
                if (silhouette == null || haveNewFormat)
                {
                    silhouette = new WriteableBitmap(
                        depthFrame.Width,
                        depthFrame.Height,
                        96,
                        96,
                        PixelFormats.Bgra32,
                        null);
    
                    SilhouetteImage.Source = silhouette;
                }
    
                silhouette.WritePixels(
                    new Int32Rect(0, 0, depthFrame.Width, depthFrame.Height),
                    convertedDepthBits,
                    depthFrame.Width * Bgra32BytesPerPixel,
                    0);
    
                Silhouette = silhouette;
    
                this.lastImageFormat = depthFrame.Format;
            }
        }
    }
    

    What I end up with is a purple silhouette of the user in a WriteableBitmap, which can be copied to an Image on the control or pulled and used elsewhere. Once you have the BitmapMask you could also map the data to the color stream if you wanted to actually see the RGB data that corresponds to that area.
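
    As a rough illustration of using the player mask against color data, here is a minimal sketch that is not taken from the toolkit samples. It assumes you already have a BGRA color buffer that has been aligned to the depth frame (same width and height, one color pixel per depth pixel); in practice the SDK's coordinate-mapping API is what gets you that alignment. The class name SilhouetteCutout, the method CutOutPlayer, and the parameter alignedColorBgra are hypothetical, and the local constant only mirrors DepthImageFrame.PlayerIndexBitmask so the sketch compiles without the SDK.

```csharp
using System;

public static class SilhouetteCutout
{
    // Mirrors DepthImageFrame.PlayerIndexBitmask (low 3 bits = player index).
    private const int PlayerIndexBitmask = 7;
    private const int Bgra32BytesPerPixel = 4;

    // Hypothetical helper: copies the color of every pixel that belongs to
    // playerIndex and leaves all other pixels fully transparent.
    // ASSUMPTION: alignedColorBgra is already registered to the depth frame.
    public static byte[] CutOutPlayer(short[] pixelData, byte[] alignedColorBgra, int playerIndex)
    {
        if (pixelData == null) throw new ArgumentNullException("pixelData");
        if (alignedColorBgra == null) throw new ArgumentNullException("alignedColorBgra");
        if (alignedColorBgra.Length != pixelData.Length * Bgra32BytesPerPixel)
        {
            throw new ArgumentException("Color buffer must hold 4 bytes per depth pixel.");
        }

        // A new byte array is zero-filled, i.e. transparent black.
        var output = new byte[alignedColorBgra.Length];

        for (int i = 0, o = 0; i < pixelData.Length; i++, o += Bgra32BytesPerPixel)
        {
            if ((pixelData[i] & PlayerIndexBitmask) == playerIndex)
            {
                output[o] = alignedColorBgra[o];         // blue
                output[o + 1] = alignedColorBgra[o + 1]; // green
                output[o + 2] = alignedColorBgra[o + 2]; // red
                output[o + 3] = 0xFF;                    // opaque
            }
        }

        return output;
    }
}
```

    The returned buffer has the same layout as convertedDepthBits above, so it could be written into the same kind of Bgra32 WriteableBitmap.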

    You can adapt the code to simulate more closely the getUserPixels function if you like. The big part you'd be interested in would be, given a depth frame and a playerIndex:

    if (depthFrame != null)
    {
        // check if the format has changed.
        bool haveNewFormat = this.lastImageFormat != depthFrame.Format;
    
        if (haveNewFormat)
        {
            this.pixelData = new short[depthFrame.PixelDataLength];
            this.depthFrame32 = new byte[depthFrame.Width * depthFrame.Height * Bgra32BytesPerPixel];
            this.convertedDepthBits = new byte[this.depthFrame32.Length];
        }
    
        depthFrame.CopyPixelDataTo(this.pixelData);
    
        for (int i16 = 0, i32 = 0; i16 < pixelData.Length && i32 < depthFrame32.Length; i16++, i32 += 4)
        {
            int player = pixelData[i16] & DepthImageFrame.PlayerIndexBitmask;
            if (player == this.playerIndex)
            {
                // this pixel "belongs" to the user identified in "playerIndex"
            }
            else
            {
                // not the requested user
            }
        }
    }
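
    To make that loop reusable, it can be wrapped in a small helper. The following is only a sketch of a getUserPixels-style function, not an SDK call: the class UserPixels and its two methods are names I made up for illustration, and the local constant mirrors DepthImageFrame.PlayerIndexBitmask so the code runs without a sensor. It takes the packed depth values you get from CopyPixelDataTo and returns either the per-pixel player index or a yes/no mask for one player.

```csharp
using System;

public static class UserPixels
{
    // Mirrors DepthImageFrame.PlayerIndexBitmask in the Kinect for Windows SDK v1.x:
    // the low 3 bits of each packed depth value hold the player index (0 = no player).
    private const int PlayerIndexBitmask = 7;

    // Hypothetical helper: one entry per depth pixel, holding the player index
    // for that pixel, or 0 where no player was detected.
    public static int[] GetUserPixels(short[] pixelData)
    {
        if (pixelData == null) throw new ArgumentNullException("pixelData");

        var users = new int[pixelData.Length];
        for (int i = 0; i < pixelData.Length; i++)
        {
            users[i] = pixelData[i] & PlayerIndexBitmask;
        }

        return users;
    }

    // Hypothetical helper: true where the pixel belongs to the requested player.
    public static bool[] GetUserMask(short[] pixelData, int playerIndex)
    {
        if (pixelData == null) throw new ArgumentNullException("pixelData");

        var mask = new bool[pixelData.Length];
        for (int i = 0; i < pixelData.Length; i++)
        {
            mask[i] = (pixelData[i] & PlayerIndexBitmask) == playerIndex;
        }

        return mask;
    }
}
```

    Inside OnDepthFrameReady you would call UserPixels.GetUserMask(this.pixelData, this.playerIndex) after CopyPixelDataTo. The remaining upper bits of each value carry the depth itself, which you can recover by shifting right by DepthImageFrame.PlayerIndexBitmaskWidth if you need distance as well as ownership.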