Tags: c++, video, directshow, avi

SetMediaTime on a CSource filter makes output AVI nonsense - any idea why?


Update: The code I originally posted did not actually reproduce the issue; my sincere apologies for not validating it. The key to the odd behavior is a small delta (300 UNITS = 30 microseconds) between when one frame ends and the next begins. For whatever reason, the capture hardware I'm using reports a different frame rate than the one implied by the timestamps it attaches to the frames it delivers. I've updated the source below to show how to imitate this behavior.

I wrote a simple "fake" image source filter for DirectShow, deriving from CSource. It works well, but I've noticed something odd that I can't explain. My FillBuffer looks like:

const REFERENCE_TIME TIME_PER_FRAME = 166000;

HRESULT MyFilterOutputPin::FillBuffer(IMediaSample *pms)
{
    //fill the bytes of the image media sample
    static REFERENCE_TIME currentTime = 0;
    REFERENCE_TIME startTime = currentTime;
    REFERENCE_TIME endTime = currentTime + TIME_PER_FRAME; //60Hz video
    // The +300 below is an update not in the original question, and is the
    // key to reproducing the behavior.
    currentTime += TIME_PER_FRAME + 300;
    pms->SetTime(&startTime, &endTime);
    pms->SetMediaTime(&startTime, &endTime);
    return S_OK;
}

and my CMediaType is set by calling

SetCMediaTypeForBitmap(1920,1080,TIME_PER_FRAME,&cmt);

where that function is implemented as

void SetCMediaTypeForBitmap(unsigned long width, unsigned long height, REFERENCE_TIME averageTimePerFrame, CMediaType *pmt)
{
    CMediaType mt;
    mt.SetType(&MEDIATYPE_Video);
    mt.SetSubtype(&MEDIASUBTYPE_RGB24);
    mt.SetFormatType(&FORMAT_VideoInfo);
    mt.SetSampleSize(GetBitmapBufferSize(width, height, BIT_COUNT));
    auto pvi = (VIDEOINFOHEADER*)mt.AllocFormatBuffer(sizeof(VIDEOINFOHEADER));
    ZeroMemory(pvi, sizeof(VIDEOINFOHEADER));
    pvi->rcSource.left = pvi->rcSource.top = 0;
    pvi->rcSource.right = width;
    pvi->rcSource.bottom = height;
    pvi->rcTarget = pvi->rcSource;
    pvi->dwBitErrorRate = 0;
    pvi->AvgTimePerFrame = averageTimePerFrame;
    pvi->bmiHeader.biSize = sizeof(BITMAPINFOHEADER);
    pvi->bmiHeader.biWidth = width;
    pvi->bmiHeader.biHeight = height;
    pvi->bmiHeader.biPlanes = 1;
    pvi->bmiHeader.biBitCount = BIT_COUNT;
    pvi->bmiHeader.biCompression = BI_RGB;
    pvi->bmiHeader.biSizeImage = mt.lSampleSize;
    // Multiply before dividing so integer truncation doesn't distort the bit rate.
    pvi->dwBitRate = (DWORD)(((uint64_t)mt.lSampleSize) * 8 * UNITS / pvi->AvgTimePerFrame);
    pvi->bmiHeader.biXPelsPerMeter = pvi->bmiHeader.biYPelsPerMeter = pvi->bmiHeader.biClrUsed = pvi->bmiHeader.biClrImportant = 0;
    *pmt = mt;
}
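
GetBitmapBufferSize and BIT_COUNT aren't shown above; here's a minimal sketch of what they might look like, assuming 24-bit RGB and the usual 4-byte DIB row alignment (both assumptions, not code from the original post):

const WORD BIT_COUNT = 24; // assumed to match MEDIASUBTYPE_RGB24 above

// Hypothetical helper: buffer size in bytes for a DIB whose rows are
// each padded up to a multiple of 4 bytes, as the bitmap format requires.
unsigned long GetBitmapBufferSize(unsigned long width, unsigned long height, WORD bitCount)
{
    unsigned long stride = ((width * bitCount + 31) / 32) * 4;
    return stride * height;
}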

If I set the media time on my samples in my override of MyFilterOutputPin::FillBuffer and then write the output to an AVI file, the AVI file will, according to VirtualDub, have roughly 300 times the number of frames it should. It lists most frames as dropped, with a real frame appearing periodically.

If I simply remove the SetMediaTime, the output AVI is completely normal.

I've experimented with different ways to set the media time: times relative to the filter's m_pStart, times on the reference clock, etc. It doesn't seem to matter; the mere presence of a media time blows up the AVI.

I've seen proper DirectShow capture filters that set media times just fine, so I'm guessing I'm failing to do something. Any thoughts/ideas?

Here's a screenshot of my file properties for about 2 seconds of capture. 138 frames were truly output, but the AVI believes it has ~40000 frames, or about 290 times the true number. If I run the same code without SetMediaTime, the AVI is 2 seconds long with 138 frames and no "dropped" frames. [Screenshot: wonky AVI properties]

The non-dropped frames are at 0, 326, 552, 878, 1104, 1430, 1756, 1982. The deltas between those are 326, 226, 326, 226, 326, 326, 226. It's definitely got me scratching my head...


Solution

I stumbled onto this bit of documentation today, and I think it actually explains things to some degree. From it:

    Optionally, the filter can also specify a media time for the sample. In a video stream, media time represents the frame number.

So the mux expects media times, if present, to be frame intervals: 0-1, 1-2, 2-3. When media times are set to contiguous chunks of reference time, like 0-100000, 100000-200000, I'm guessing the mux copes. But when there are gaps between one sample's media stop and the next sample's media start, as in my FillBuffer above, I can understand from Microsoft's documentation how things fall apart.

Knowing this is actually pretty powerful. Since AVI is a constant-framerate format, you can use media times to communicate frame drops when you need to (see the sketch below). I've started using them successfully for this purpose.

FYI, I tried including media times based on actual times in a project again a few days ago, and instead of funny results, the DirectShow graph would simply stop with E_FAIL.

tl;dr Only use media times to communicate frame numbers, at least with the AVI mux.
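
Here's a minimal sketch of FillBuffer using frame numbers as media times, building on the example from the question. The framesElapsed variable and its drop detection are illustrative additions, not code from the original post: the idea is that when frames are dropped, the media start jumps ahead by the number of missed frames and the mux pads the AVI accordingly.

HRESULT MyFilterOutputPin::FillBuffer(IMediaSample *pms)
{
    //fill the bytes of the image media sample, as before
    static REFERENCE_TIME currentTime = 0;
    static LONGLONG currentFrame = 0;

    REFERENCE_TIME startTime = currentTime;
    REFERENCE_TIME endTime = currentTime + TIME_PER_FRAME;
    currentTime += TIME_PER_FRAME + 300; // the timestamps can still drift
    pms->SetTime(&startTime, &endTime);

    // Media times carry frame numbers, not reference time: a healthy
    // stream looks like 0-1, 1-2, 2-3, ... Jumping ahead (e.g. 2-3
    // followed by 5-6) tells the mux that frames 3 and 4 were dropped.
    LONGLONG framesElapsed = 1; // hypothetical: set to >1 when a drop is detected
    LONGLONG mediaStart = currentFrame;
    LONGLONG mediaEnd = currentFrame + 1;
    currentFrame += framesElapsed;
    pms->SetMediaTime(&mediaStart, &mediaEnd);

    return S_OK;
}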