How to record H.264 video source directly to file in Windows.Media.Capture


I want to develop a Computer Vision algorithm that takes webcam video as input. For this I need to record a training dataset of videos in the same format as the images I’ll be getting in production.

I’m concerned that encoding video files in a lossy format and then decoding them for training will degrade or otherwise change the training images, so they won’t be exactly the same as the images I’ll see in production.

Now, I see that my webcam (running on Surface Pro 3) has H264 video sources and YUY2 video sources.

So I figure that the H264 is the source of the images and the YUY2 frames are probably decoded images. If I record the H264 directly to a file and later decode that file, then the decoded images will be equivalent to what I’d get from the YUY2 video source. I would not be encoding new video (and thereby changing it), but rather storing the source H264, which was already encoded.

My questions are:

  1. Is this assumption true? Is the H264 feed the source and YUY2 the product of that source? How can I check?

  2. How do I record from the H264 video source directly to files without decoding and re-encoding?

I’m using the new Windows.Media.Capture API – but I’ll use other APIs if necessary.


Solution

  • Surface Pro 3 (unlike Surface Pro 4, by the way) is equipped with a camera capable of H.264 hardware compression, which is why you are seeing both YUY2 and H264 as available options.

    It is the camera which compresses video, so YUY2 is the raw feed and H264 is its derivative.

    "Is this assumption true? Is the H264 feed the source and YUY2 the product of that source? How can I check?"

    No, it is the opposite.

    "How do I record from the H264 video source directly to files without decoding and re-encoding?"

    Read the H264 samples and route them, still compressed, to a multiplexer that produces MP4 files with an H264 video track. This is definitely possible with Windows Media Foundation's Source Reader and Sink Writer, and possibly with the Media Session API as well. I am not sure about Windows.Media.Capture, which is presumably a layer on top of those APIs.
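
    For the Source Reader and Sink Writer route, a minimal C++ sketch might look like the following. It is an illustration under assumptions, not a tested recipe: the function name RecordH264, the output file out.mp4 and the count of 300 samples are made up for the example, it simply takes the first capture device, and it searches every stream for a native H264 type because some cameras expose H264 on a separate stream. It also assumes the MP4 sink accepts the camera's H264 media type as-is; depending on the camera and Windows version the type may need extra attributes (for example a sequence header), and a real recorder would probably want to start on a key frame.

```cpp
// Sketch: copy compressed H.264 samples from a webcam into an MP4 file using
// Media Foundation's Source Reader and Sink Writer (no decode, no re-encode).
// Assumptions: first capture device, fixed sample count, hypothetical file name.
#include <windows.h>
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#include <mferror.h>
#include <wrl/client.h>
#include <cstdio>

#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mf.lib")
#pragma comment(lib, "mfreadwrite.lib")
#pragma comment(lib, "mfuuid.lib")
#pragma comment(lib, "ole32.lib")

using Microsoft::WRL::ComPtr;

// Print the failing call and return its HRESULT from the enclosing function.
#define CHECK(expr) do { HRESULT hr_ = (expr); if (FAILED(hr_)) { \
    std::printf("%s failed: 0x%08lX\n", #expr, static_cast<unsigned long>(hr_)); \
    return hr_; } } while (0)

HRESULT RecordH264(const wchar_t* path, int sampleCount) {
    // Enumerate video capture devices and activate the first one.
    ComPtr<IMFAttributes> attrs;
    CHECK(MFCreateAttributes(&attrs, 1));
    CHECK(attrs->SetGUID(MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE,
                         MF_DEVSOURCE_ATTRIBUTE_SOURCE_TYPE_VIDCAP_GUID));
    IMFActivate** devices = nullptr;
    UINT32 count = 0;
    CHECK(MFEnumDeviceSources(attrs.Get(), &devices, &count));
    ComPtr<IMFMediaSource> source;
    HRESULT hr = count ? devices[0]->ActivateObject(IID_PPV_ARGS(&source)) : E_FAIL;
    for (UINT32 i = 0; i < count; ++i) devices[i]->Release();
    CoTaskMemFree(devices);
    CHECK(hr);

    ComPtr<IMFSourceReader> reader;
    CHECK(MFCreateSourceReaderFromMediaSource(source.Get(), nullptr, &reader));

    // Search all streams for a native (camera-produced) H.264 media type.
    ComPtr<IMFMediaType> h264;
    DWORD stream = 0;
    for (DWORD s = 0; !h264; ++s) {
        for (DWORD i = 0; !h264; ++i) {
            ComPtr<IMFMediaType> type;
            HRESULT hrType = reader->GetNativeMediaType(s, i, &type);
            if (hrType == MF_E_INVALIDSTREAMNUMBER) {
                std::printf("no native H.264 type found\n");
                return hrType;              // ran out of streams
            }
            if (FAILED(hrType)) break;      // no more types on this stream
            GUID subtype = GUID_NULL;
            type->GetGUID(MF_MT_SUBTYPE, &subtype);
            if (IsEqualGUID(subtype, MFVideoFormat_H264)) { h264 = type; stream = s; }
        }
    }

    // Select only that stream and ask for the native type, so no decoder is added.
    CHECK(reader->SetStreamSelection(static_cast<DWORD>(MF_SOURCE_READER_ALL_STREAMS), FALSE));
    CHECK(reader->SetStreamSelection(stream, TRUE));
    CHECK(reader->SetCurrentMediaType(stream, nullptr, h264.Get()));

    // Use the same type for the writer's output and input, so no encoder is added.
    ComPtr<IMFSinkWriter> writer;
    CHECK(MFCreateSinkWriterFromURL(path, nullptr, nullptr, &writer));
    DWORD outStream = 0;
    CHECK(writer->AddStream(h264.Get(), &outStream));
    CHECK(writer->SetInputMediaType(outStream, h264.Get(), nullptr));
    CHECK(writer->BeginWriting());

    LONGLONG firstTime = -1;
    for (int written = 0; written < sampleCount; ) {
        DWORD flags = 0;
        LONGLONG time = 0;
        ComPtr<IMFSample> sample;
        CHECK(reader->ReadSample(stream, 0, nullptr, &flags, &time, &sample));
        if (flags & MF_SOURCE_READERF_ENDOFSTREAM) break;
        if (!sample) continue;                            // gap or stream tick
        if (firstTime < 0) firstTime = time;
        CHECK(sample->SetSampleTime(time - firstTime));   // make the file start at 0
        CHECK(writer->WriteSample(outStream, sample.Get()));
        ++written;
    }
    return writer->Finalize();
}

int main() {
    if (FAILED(CoInitializeEx(nullptr, COINIT_MULTITHREADED))) return 1;
    if (FAILED(MFStartup(MF_VERSION))) { CoUninitialize(); return 1; }
    HRESULT hr = RecordH264(L"out.mp4", 300);   // hypothetical name and length
    MFShutdown();
    CoUninitialize();
    return SUCCEEDED(hr) ? 0 : 1;
}
```

    Because the reader is set to one of the camera's native types and the writer's input type matches its output type, the samples should pass through without a decoder or encoder in between. This needs Windows and a camera that actually exposes H264; if the search finds no such type, the sketch stops rather than falling back to YUY2.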