Encoding audio and video separately with Media Foundation and then multiplexing with ffmpeg is much faster than encoding both audio and video together and multiplexing in Media Foundation. I'm curious to know why.
I'm encoding AAC audio and H.264 video; the video input into the encoder has a variable (and quite low) frame rate. The speed I'm getting is roughly what I would expect if the video input were at a constant frame rate equal to that of the output.
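For reference, the separate multiplexing step is a plain stream-copy remux; a minimal sketch (filenames are placeholders, not the actual ones used):

```shell
# Mux pre-encoded elementary streams into an MP4 container.
# -c copy remuxes without re-encoding, so this step is nearly instant.
ffmpeg -i video.h264 -i audio.aac -c copy output.mp4
```

Since no encoding happens here, all the time cost lives in the two separate Media Foundation encoding passes.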
Notably, if I write the entire audio (or video) stream first and then the other stream, all of the encoding time is spent writing the first stream, and the second stream is written almost instantly.
Can someone tell me what is going on?
The multiplexer is likely throttling the video leg of the input because, with default settings at least, the produced file has a layout in which video and audio data for nearby time points are packaged together (interleaved). To maintain that interleaving, the sink writer holds back samples on one stream until it has received data covering the same time range on the other.
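If that throttling is indeed the bottleneck, Media Foundation exposes the `MF_SINK_WRITER_DISABLE_THROTTLING` attribute to turn it off when creating the sink writer. A minimal Windows-only sketch (the output filename is a placeholder, error handling kept to bare HRESULT checks):

```cpp
#include <mfapi.h>
#include <mfidl.h>
#include <mfreadwrite.h>
#pragma comment(lib, "mfplat.lib")
#pragma comment(lib, "mfreadwrite.lib")

// Sketch: create a sink writer with input throttling disabled, so the
// writer no longer holds back one stream while waiting for interleaved
// samples from the other. The caller still configures output/input
// media types on the returned writer as usual.
HRESULT CreateUnthrottledSinkWriter(IMFSinkWriter **ppWriter)
{
    IMFAttributes *pAttr = nullptr;
    HRESULT hr = MFCreateAttributes(&pAttr, 1);
    if (SUCCEEDED(hr))
        hr = pAttr->SetUINT32(MF_SINK_WRITER_DISABLE_THROTTLING, TRUE);
    if (SUCCEEDED(hr))
        hr = MFCreateSinkWriterFromURL(L"output.mp4", nullptr, pAttr, ppWriter);
    if (pAttr)
        pAttr->Release();
    return hr;
}
```

Note that disabling throttling trades the memory bound away: samples queue up inside the writer instead of back-pressuring the caller, and the resulting file's interleaving may suffer for streaming playback.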