
What does ffmpeg think is the difference between an audio frame and audio sample?


Here's a curious option listed in the man pages of ffmpeg:

-aframes number (output)
    Set the number of audio frames to output. This is an obsolete alias for "-frames:a", which you should use instead.

What counts as an 'audio frame' is unclear to me. This SO answer says that frame is synonymous with sample, but that can't be what ffmpeg means by a frame. Consider this example, where I resample some audio to 22.05 kHz and limit the output to exactly 313 frames:

$ ffmpeg -i input.mp3 -frames:a 313 -ar:a 22.05K output.wav

If 'frame' and 'sample' were synonymous, we would expect the audio duration to be about 0.014 seconds (313 / 22050), but the actual duration is 8 seconds. ffmpeg seems to think the frame rate of my input is 39.125 frames per second.

What's going on here? What does ffmpeg think an audio frame really is? How do I go about finding this frame rate of my input audio?


Solution

  • "Frame" is a bit of an overloaded term here.

    In PCM, a frame is the set of samples occurring at the same time, one per channel. If your audio were 22.05 kHz and you had 313 PCM frames, its length in time would be about 14 milliseconds, as you expect.

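    However, your audio isn't PCM... it's MP3. An MP3 frame holds a fixed number of samples (1152 for MPEG-1 Layer III), which is about 26 milliseconds at 44.1 kHz. 313 of them add up to about 8 seconds. The frame here is the smallest block of audio the decoder works with; it can't be split into individually decodable samples. (In fact, some frames aren't even self-contained: they depend on other frames via the bit reservoir!)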
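
The arithmetic behind both readings of "frame" can be checked with a short sketch. It assumes the input is a 44.1 kHz MPEG-1 Layer III file, which the question does not state; the 1152 samples-per-frame constant applies to that case only, and the function names are made up for illustration.

```python
# Assumption: MPEG-1 Layer III at 44.1 kHz. Other MPEG versions, layers,
# or sample rates give different numbers.
MP3_SAMPLES_PER_FRAME = 1152
ASSUMED_INPUT_RATE_HZ = 44100


def pcm_duration(frames: int, sample_rate_hz: float) -> float:
    """Seconds of audio when a frame is one sample per channel (PCM)."""
    return frames / sample_rate_hz


def block_duration(frames: int, samples_per_frame: int, sample_rate_hz: float) -> float:
    """Seconds of audio when each frame is a fixed-size block of samples."""
    return frames * samples_per_frame / sample_rate_hz


def frames_per_second(samples_per_frame: int, sample_rate_hz: float) -> float:
    """Frame rate of a codec that packs a fixed number of samples per frame."""
    return sample_rate_hz / samples_per_frame


# 313 PCM frames at the 22.05 kHz output rate from the question.
pcm_seconds = pcm_duration(313, 22050)

# 313 MP3 frames at the assumed 44.1 kHz input rate.
mp3_seconds = block_duration(313, MP3_SAMPLES_PER_FRAME, ASSUMED_INPUT_RATE_HZ)
mp3_fps = frames_per_second(MP3_SAMPLES_PER_FRAME, ASSUMED_INPUT_RATE_HZ)

print(f"313 PCM frames: {pcm_seconds * 1000:.1f} ms")
print(f"313 MP3 frames: {mp3_seconds:.2f} s at {mp3_fps:.2f} frames/s")
```

Under these assumptions the MP3 frame rate comes out a little above 38 frames per second, close to the 39.125 in the question, which is what 313 divided by a rounded 8 seconds gives.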