Tags: avfoundation, osx-mavericks, video-capture, decoding, compression

h.264 data packets to 'realtime' playback/preview using Apple's VideoToolbox


From the Apple documentation, the QuickTime framework is deprecated in OS X 10.9 in favor of AVFoundation and AVKit. For reasons I am not sure of, most of the documentation neglects to mention that some of the QuickTime framework's replacement functionality is covered by a framework called VideoToolbox. That replacement functionality includes decoding and decompression, among other things.

I would like to decode and decompress H.264-encoded video data packets (NAL units, TS packets, etc.), put them in a pixel buffer, and then use Core Video and OpenGL to display the video as it comes in.

I am getting the video data packets from an encoding box via USB. This box does not show up when I run [AVCaptureDevice devices], so I cannot (to my knowledge) use most of AVFoundation to interface with it directly. However, there is an API that comes with the box that gives me access to the video data packets. I can write them to disk and create a video that QuickTime can play, but real-time playback is the issue. Hence the question about decoding, decompressing, and creating a pixel buffer so I can use Core Video and OpenGL.
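
For illustration, this is the kind of check I mean (a sketch only); the box never appears in this list:

    #import <AVFoundation/AVFoundation.h>

    // Print every capture device AVFoundation can see; the encoder box is never among them.
    static void LogCaptureDevices(void)
    {
        for (AVCaptureDevice *device in [AVCaptureDevice devices]) {
            NSLog(@"%@ (%@)", device.localizedName, device.uniqueID);
        }
    }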

I think if I can create a pixel buffer, I may be able to use AVAssetWriterInputPixelBufferAdaptor and figure out some way to get that into an AVCaptureSession. If I can do that, I should be able to forgo OpenGL and use the tools afforded me in AVFoundation and AVKit.

Also, from my reading of the AVFoundation documentation, every time it talks about streams of video/audio data, it is talking about one of two things: either a stream coming from an AVCaptureDevice or a stream handled through HTTP Live Streaming. Like I said before, the box that produces the video data packets does not show up as an AVCaptureDevice, and I would rather not build/implement an HTTP Live Streaming server if I do not need to. (Hopefully I do not need to, although I saw online that some people did.)

Any help would be greatly appreciated.

Thanks!


Solution

  • Ok, it has been a while, but I finally figured out how to use VideoToolbox correctly with a raw encoded data stream (i.e., an elementary stream not wrapped in a container).

    Basically, I had to familiarize myself with the H.264 specification, and I got much help from this great post.

    Here are the steps (rough code sketches for the key ones follow the list):

    1. Make sure you get your Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) before you start processing any data.
    2. Use the SPS and PPS to get the data needed to create an avcC atom header. See the post I linked to above.
    3. Save the avcC atom header in an NSData.
    4. Create a CMVideoFormatDescription with the avcC atom and configured extensions. See the CMVideoFormatDescriptionCreate documentation.
    5. Set up a VTDecompressionOutputCallbackRecord.
    6. Set the pixelBufferAttributes that will be used in VTDecompressionSessionCreate.
    7. Create a CMBlockBuffer from the data that was not used in creating the CMVideoFormatDescription. See CMBlockBufferCreateWithMemoryBlock. Basically, you want to make sure you are adding your raw NAL units that are not SPS or PPS. You may need to prepend each NAL unit with a 4-byte length header (so the block is the NAL size + 4) for everything to work right. Again, refer to the link above.
    8. Create CMBlockBuffer
    9. Create CMSampleBuffer
    10. Use CMSampleBuffer in VTDecompressionSessionDecodeFrame to do the decoding.
    11. Run VTDecompressionSessionWaitForAsynchronousFrames after VTDecompressionSessionDecodeFrame. I noticed if I did not run VTDecompressionSessionWaitForAsynchronousFrames, my display output was jittery.
    12. Whatever functionality you defined for the callback in the VTDecompressionOutputCallbackRecord will get called. Presently, I am passing a CVPixelBufferRef to OpenGL to write the video to the screen. Maybe at some point I will try to use AVFoundation to write to the screen.
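
    To make the steps above more concrete, here are a few rough sketches of the key calls. Treat them as outlines under my assumptions rather than drop-in code; spsData, ppsData, and nalData are placeholders for whatever your parser hands you.

    The first sketch covers steps 1-4. Instead of hand-building the avcC atom and passing it through the format description extensions, it uses CMVideoFormatDescriptionCreateFromH264ParameterSets (available on OS X 10.9 and later), which builds the avcC data for you from the same SPS and PPS. If you need the manual avcC route, follow the linked post.

        #import <Foundation/Foundation.h>
        #import <CoreMedia/CoreMedia.h>

        // spsData / ppsData: raw SPS and PPS payloads with the start codes stripped.
        static CMVideoFormatDescriptionRef CreateFormatDescription(NSData *spsData, NSData *ppsData)
        {
            const uint8_t * const parameterSets[2] = { spsData.bytes, ppsData.bytes };
            const size_t parameterSetSizes[2] = { spsData.length, ppsData.length };

            CMVideoFormatDescriptionRef formatDesc = NULL;
            OSStatus status = CMVideoFormatDescriptionCreateFromH264ParameterSets(
                kCFAllocatorDefault,
                2,                  // parameter set count (SPS + PPS)
                parameterSets,
                parameterSetSizes,
                4,                  // size of the NAL length prefix used later
                &formatDesc);

            if (status != noErr) {
                NSLog(@"Could not create format description: %d", (int)status);
                return NULL;
            }
            return formatDesc;      // caller owns the reference
        }

    The second sketch covers steps 5, 6, and 12: the output callback record, the pixel buffer attributes, and the callback itself. The pixel format and the OpenGL compatibility flag are my choices; pick whatever your renderer needs.

        #import <VideoToolbox/VideoToolbox.h>

        // Step 12: this gets called once per decoded frame. The imageBuffer is the
        // CVPixelBufferRef that I hand off to OpenGL (hand-off omitted here).
        static void DecompressionOutputCallback(void *decompressionOutputRefCon,
                                                void *sourceFrameRefCon,
                                                OSStatus status,
                                                VTDecodeInfoFlags infoFlags,
                                                CVImageBufferRef imageBuffer,
                                                CMTime presentationTimeStamp,
                                                CMTime presentationDuration)
        {
            if (status != noErr || imageBuffer == NULL) {
                NSLog(@"Decode failed: %d", (int)status);
                return;
            }
            // e.g. upload imageBuffer to an OpenGL texture here.
        }

        static VTDecompressionSessionRef CreateDecompressionSession(CMVideoFormatDescriptionRef formatDesc)
        {
            // Step 5: hook up the output callback record.
            VTDecompressionOutputCallbackRecord callbackRecord;
            callbackRecord.decompressionOutputCallback = DecompressionOutputCallback;
            callbackRecord.decompressionOutputRefCon = NULL;   // or a context pointer

            // Step 6: pixel buffer attributes used by VTDecompressionSessionCreate.
            NSDictionary *pixelBufferAttributes = @{
                (__bridge id)kCVPixelBufferPixelFormatTypeKey : @(kCVPixelFormatType_32BGRA),
                (__bridge id)kCVPixelBufferOpenGLCompatibilityKey : @YES
            };

            VTDecompressionSessionRef session = NULL;
            OSStatus status = VTDecompressionSessionCreate(kCFAllocatorDefault,
                                                           formatDesc,
                                                           NULL,   // let VideoToolbox pick the decoder
                                                           (__bridge CFDictionaryRef)pixelBufferAttributes,
                                                           &callbackRecord,
                                                           &session);
            if (status != noErr) {
                NSLog(@"Could not create decompression session: %d", (int)status);
                return NULL;
            }
            return session;
        }

    The third sketch covers steps 7-11 for a single non-SPS/PPS NAL unit: the 4-byte big-endian length prefix replaces the Annex B start code, the result is wrapped in a CMBlockBuffer and a CMSampleBuffer, and the sample buffer is fed to the decoder. Timing information is left out to keep it short.

        #include <stdlib.h>
        #include <string.h>
        #import <Foundation/Foundation.h>
        #import <VideoToolbox/VideoToolbox.h>

        static void DecodeNALUnit(VTDecompressionSessionRef session,
                                  CMVideoFormatDescriptionRef formatDesc,
                                  NSData *nalData)   // one NAL unit, start code stripped
        {
            // Steps 7-8: build [4-byte length][NAL payload] and wrap it in a CMBlockBuffer.
            size_t blockLength = 4 + nalData.length;
            uint8_t *block = malloc(blockLength);
            uint32_t bigEndianLength = CFSwapInt32HostToBig((uint32_t)nalData.length);
            memcpy(block, &bigEndianLength, 4);
            memcpy(block + 4, nalData.bytes, nalData.length);

            CMBlockBufferRef blockBuffer = NULL;
            OSStatus status = CMBlockBufferCreateWithMemoryBlock(kCFAllocatorDefault,
                                                                 block, blockLength,
                                                                 kCFAllocatorMalloc,  // frees the malloc'd block for us
                                                                 NULL, 0, blockLength, 0,
                                                                 &blockBuffer);
            if (status != noErr) { free(block); return; }

            // Step 9: wrap the block buffer in a CMSampleBuffer tagged with the format description.
            CMSampleBufferRef sampleBuffer = NULL;
            const size_t sampleSizes[] = { blockLength };
            status = CMSampleBufferCreate(kCFAllocatorDefault,
                                          blockBuffer, true, NULL, NULL,
                                          formatDesc,
                                          1,             // one sample
                                          0, NULL,       // no timing info in this sketch
                                          1, sampleSizes,
                                          &sampleBuffer);

            if (status == noErr) {
                // Steps 10-11: decode, then wait so the output does not come out jittery.
                VTDecodeInfoFlags infoFlags = 0;
                VTDecompressionSessionDecodeFrame(session, sampleBuffer, 0, NULL, &infoFlags);
                VTDecompressionSessionWaitForAsynchronousFrames(session);
            }

            if (sampleBuffer) CFRelease(sampleBuffer);
            if (blockBuffer) CFRelease(blockBuffer);
        }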

    I hope this helps someone.