Tags: c++, winapi, video, frame-rate, ms-media-foundation

Media Foundation: get the exact frame (sample) count of a video file


I have a video file and I have successfully extracted the average frameRate and duration:

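// MF_MT_FRAME_RATE packs the numerator (high 32 bits) and denominator (low 32 bits)
UINT32 frameRateNum = 0, frameRateDen = 0;
MFGetAttributeRatio(mediaType, MF_MT_FRAME_RATE, &frameRateNum, &frameRateDen);
double frameRate = static_cast<double>(frameRateNum) / frameRateDen;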

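PROPVARIANT prop;
PropVariantInit(&prop);
m_pReader->GetPresentationAttribute(MF_SOURCE_READER_MEDIASOURCE,
    MF_PD_DURATION, &prop);
// MF_PD_DURATION is a UINT64 in 100-nanosecond units; convert to seconds
double duration = prop.uhVal.QuadPart / 1e7;
PropVariantClear(&prop);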

Now I simply calculate the number of frames: frameCount = duration * frameRate
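As a hedged illustration of that estimate, here is a minimal, portable sketch in plain C++ (no Windows headers). It assumes the frame rate arrives as a packed 64-bit ratio, which is how MF_MT_FRAME_RATE is stored (numerator in the high 32 bits, denominator in the low 32 bits), and that the duration is in 100-nanosecond units. The names Ratio, unpackRatio and estimateFrameCount are made up for this example and are not part of Media Foundation.

```cpp
#include <cstdint>

// Hypothetical helper type: a frame rate expressed as numerator / denominator.
struct Ratio
{
    std::uint32_t num;
    std::uint32_t den;
};

// Splits a packed ratio: numerator in the high 32 bits, denominator in the low 32 bits.
Ratio unpackRatio(std::uint64_t packed)
{
    return { static_cast<std::uint32_t>(packed >> 32),
             static_cast<std::uint32_t>(packed & 0xFFFFFFFFu) };
}

// Estimates the frame count from a duration in 100-ns units and a frame rate ratio.
// Uses integer arithmetic and rounds to the nearest frame; returns 0 if the
// denominator is 0. Note: duration100ns * rate.num can overflow for extremely
// long durations combined with very large numerators.
std::uint64_t estimateFrameCount(std::uint64_t duration100ns, Ratio rate)
{
    if (rate.den == 0)
        return 0;
    const std::uint64_t ticksPerSecond = 10000000ULL; // 100-ns units per second
    const std::uint64_t divisor = ticksPerSecond * rate.den;
    return (duration100ns * rate.num + divisor / 2) / divisor;
}
```

For example, a 10-second file (100000000 units) at 30000/1001 fps gives an estimate of 300 frames. This is still only an estimate derived from metadata, not a count of the samples actually stored in the file.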

And I compare it with the actual number of samples by iterating over them:

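realFrameCount = 0;
while (true)
{
    HRESULT hr = m_pReader->ReadSample(
            MF_SOURCE_READER_FIRST_VIDEO_STREAM,
            0, &streamIndex, &flags, &llTimeStamp, &videoSample);

    // Stop on error or when the reader reports the end of the stream
    if (FAILED(hr) || (flags & MF_SOURCE_READERF_ENDOFSTREAM))
        break;

    // ReadSample can succeed without delivering a sample, so count only real ones
    // (if videoSample is a raw pointer, release it here before the next call)
    if (videoSample)
        realFrameCount++;
}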

In some cases I get an exact match, but for some files the two counts differ by a small amount.

My question is: Is there a way to get the exact number of frames without iterating over all samples?

Thanks


Solution

  • You are assuming that you are dealing with a fixed frame rate file, that the metadata is accurate, and that the file does not have dropped or missing frames. Sometimes this is the case and your multiplication gives the exact value; at other times the assumption is incorrect. In general, the duration and frame rate values you read come from file metadata, and they do not have to match the actual number and parameters of the frames you read. The metadata is informational only.

    So the short answer is: you have to read through the file and count the samples, without processing their data.

    Standard positioning in Media Foundation assumes that you use a time format with units of 100 ns. The documentation for IMFSourceReader::SetCurrentPosition, for one, explains it this way:

    guidTimeFormat [in]

    A GUID that specifies the time format. The time format defines the units for the varPosition parameter. The following value is defined for all media sources:

    GUID_NULL - 100-nanosecond units.

    Some media sources might support additional values.

    That is, some format-specific sources might support other time formats such as video frames, in which case you could refer to a specific frame by its ordinal number and get the duration in frames. I am not aware of any stock sources that actually offer this functionality. The predecessor API, DirectShow, had TIME_FORMAT_FRAME for this purpose, and I think it was omitted in the transition to Media Foundation.
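To illustrate why a metadata-based estimate and the real sample count can disagree, here is a small, portable simulation in plain C++. It does not use Media Foundation at all: makeTimestamps is a hypothetical stand-in for a file's stored samples, and estimateFromMetadata mimics the duration-times-rate calculation. Both names and the dropped-frame model are assumptions made for this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a file: the presentation timestamps (100-ns units)
// of the samples actually stored, with some nominal frames missing.
std::vector<std::int64_t> makeTimestamps(std::size_t nominalFrames,
                                         std::int64_t frameDuration100ns,
                                         const std::vector<std::size_t>& droppedIndices)
{
    std::vector<std::int64_t> timestamps;
    for (std::size_t i = 0; i < nominalFrames; ++i)
    {
        const bool dropped =
            std::find(droppedIndices.begin(), droppedIndices.end(), i) != droppedIndices.end();
        if (!dropped)
            timestamps.push_back(static_cast<std::int64_t>(i) * frameDuration100ns);
    }
    return timestamps;
}

// Metadata-style estimate: total duration divided by the nominal frame duration,
// rounded to the nearest frame.
std::size_t estimateFromMetadata(std::int64_t duration100ns, std::int64_t frameDuration100ns)
{
    return static_cast<std::size_t>(
        (duration100ns + frameDuration100ns / 2) / frameDuration100ns);
}
```

With a nominal 25 fps file of 2 seconds (frame duration 400000 units, duration 20000000 units), the estimate is 50 frames. If two frames are missing, for instance makeTimestamps(50, 400000, {10, 20}), only 48 samples exist, and only counting them reveals that.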