macos video avfoundation metadata

Add a chapter track while creating a video with AVFoundation


I'm creating a video (QuickTime .mov format, H.264 encoded) from a bunch of still images, and I want to add a chapter track in the process. The video is being created fine, and I am not detecting any errors, but QuickTime Player does not show any chapters. I am aware of this question but it does not solve my problem.

The old QuickTime Player 7, unlike recent versions, can show information about the tracks of a movie. When I open a movie with working chapters (created using old QuickTime code), I see a video track and a text track, and the video track knows that the text track is providing chapters for the video. If I examine a movie created by my new code, however, there is a metadata track along with the video track, but QuickTime does not know that the metadata track is supposed to be providing chapters. Things I've read have led me to believe that one is supposed to use metadata for chapters, but has anyone actually gotten that to work? Would a text track work?

Here's how I am creating the AVAssetWriterInput for the metadata.

// Make dummy AVMetadataItem to get its format
AVMutableMetadataItem* dummyMetaItem = [AVMutableMetadataItem metadataItem];
dummyMetaItem.identifier = AVMetadataIdentifierQuickTimeUserDataChapter;
dummyMetaItem.dataType = (NSString*) kCMMetadataBaseDataType_UTF8;
dummyMetaItem.value = @"foo";
AVTimedMetadataGroup* dummyGroup = [[[AVTimedMetadataGroup alloc]
    initWithItems: @[dummyMetaItem]
    timeRange: CMTimeRangeMake( kCMTimeZero, kCMTimeInvalid )] autorelease];
CMMetadataFormatDescriptionRef metaFmt = [dummyGroup copyFormatDescription];

// Make the input
AVAssetWriterInput* metaWriterInput = [AVAssetWriterInput
    assetWriterInputWithMediaType: AVMediaTypeMetadata
    outputSettings: nil
    sourceFormatHint: metaFmt];
CFRelease( metaFmt );

// Associate metadata input with video input
[videoInput addTrackAssociationWithTrackOfInput: metaWriterInput
    type: AVTrackAssociationTypeChapterList];

// Associate metadata input with AVAssetWriter
[writer addInput: metaWriterInput];

// Create a metadata adaptor
AVAssetWriterInputMetadataAdaptor* metaAdaptor = [AVAssetWriterInputMetadataAdaptor
    assetWriterInputMetadataAdaptorWithAssetWriterInput: metaWriterInput];

P.S. I tried using a text track instead (an AVAssetWriterInput of type AVMediaTypeText) and QuickTime Player says the result is "not a movie". Not sure what I'm doing wrong.


Solution

  • I managed to use a text track to provide chapters. I spent an Apple developer tech support incident and was told that this is the right way to do it.

    Setup:

    I assume that the AVAssetWriter has been created, and an AVAssetWriterInput for the video track has been assigned to it.

    The trickiest part here is creating the text format description. The docs say that CMTextFormatDescriptionCreateFromBigEndianTextDescriptionData takes a TextDescription structure as input, but neglect to say where that structure is defined. It is in Movies.h, which is in QuickTime.framework, which is no longer part of the Mac OS SDK. Thanks, Apple.
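
    Since Movies.h is no longer available, one workaround is to declare the structure locally. The sketch below is an assumption, not Apple's header: the field layout is reconstructed from the text sample description in the QuickTime File Format documentation and from the old 32-bit header as best remembered (2-byte packing, 60 bytes with an empty font name), so verify it before relying on it. HostToBig32 and InitChapterTextDescription are hypothetical helper names; on a Mac, OSSwapHostToBigInt32 does the same job as the first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: local stand-in for TextDescription from the old Movies.h.
   The layout is reconstructed, not copied from an Apple header. */
#pragma pack(push, 2)
typedef struct TextDescription {
    uint32_t descSize;           /* total size of this structure */
    uint32_t dataFormat;         /* 'text' */
    uint32_t resvd1;
    uint16_t resvd2;
    uint16_t dataRefIndex;
    uint32_t displayFlags;
    uint32_t textJustification;
    uint16_t bgColor[3];         /* RGBColor: red, green, blue */
    int16_t  defaultTextBox[4];  /* Rect: top, left, bottom, right */
    /* Default style, flattened from the old ScrpSTElement struct */
    uint32_t scrpStartChar;
    int16_t  scrpHeight;
    int16_t  scrpAscent;
    int16_t  scrpFont;
    uint16_t scrpFace;
    int16_t  scrpSize;
    uint16_t scrpColor[3];
    char     defaultFontName[1]; /* Pascal string; length byte 0 means empty */
} TextDescription;
#pragma pack(pop)

/* Hypothetical portable equivalent of OSSwapHostToBigInt32: returns a value
   whose in-memory bytes are the big-endian encoding of v. */
static uint32_t HostToBig32(uint32_t v)
{
    uint8_t bytes[4] = {
        (uint8_t)(v >> 24), (uint8_t)(v >> 16), (uint8_t)(v >> 8), (uint8_t)v
    };
    uint32_t out;
    memcpy(&out, bytes, sizeof out);
    return out;
}

/* Hypothetical helper: zero the description, then set the two fields
   that the setup code fills in (size and the 'text' format code). */
static void InitChapterTextDescription(TextDescription *desc)
{
    assert(desc != NULL);
    memset(desc, 0, sizeof *desc);
    desc->descSize = HostToBig32((uint32_t)sizeof *desc);
    desc->dataFormat = HostToBig32(0x74657874u); /* 'text' */
}
```

    With a declaration like this in scope, the setup code can use TextDescription as written; the all-zero remainder leaves the display flags, text box, and style at their defaults.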

    // Create AVAssetWriterInput
    AVAssetWriterInput* textWriterInput = [AVAssetWriterInput
        assetWriterInputWithMediaType: AVMediaTypeText
        outputSettings: nil ];
    textWriterInput.marksOutputTrackAsEnabled = NO;
    
    // Connect input to writer
    [writer addInput: textWriterInput];
    
    // Mark the text track as providing chapters for the video
    [videoWriterInput addTrackAssociationWithTrackOfInput: textWriterInput
        type: AVTrackAssociationTypeChapterList];
    
    // Create the text format description, which we will need
    // when creating each sample.
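    CMFormatDescriptionRef textFmt = NULL;
    TextDescription textDesc;   // struct from the old Movies.h
    memset( &textDesc, 0, sizeof(textDesc) );
    // The description data must be big-endian, hence the swaps.
    textDesc.descSize = OSSwapHostToBigInt32( sizeof(textDesc) );
    textDesc.dataFormat = OSSwapHostToBigInt32( 'text' );
    OSStatus fmtErr = CMTextFormatDescriptionCreateFromBigEndianTextDescriptionData( NULL,
        (const uint8_t*)&textDesc, sizeof(textDesc), NULL, kCMMediaType_Text,
        &textFmt );
    if (fmtErr != noErr)
    {
        NSLog( @"Could not create text format description: %d", (int)fmtErr );
    }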
    

    Writing a Sample:

    CMSampleTimingInfo timing =
    {
        CMTimeMakeWithSeconds( endTime - startTime, timeScale ),    // duration
        CMTimeMakeWithSeconds( startTime, timeScale ),              // presentationTimeStamp
        kCMTimeInvalid                                              // decodeTimeStamp (unused)
    };
    CMSampleBufferRef textSample = NULL;
    CMPSampleBufferCreateWithText( NULL, (CFStringRef)theTitle, true, NULL, NULL,
        textFmt, &timing, &textSample );
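    if (textSample != NULL)
    {
        [textWriterInput appendSampleBuffer: textSample];
        // Assumes the Create rule applies, so we own the buffer.
        CFRelease( textSample );
    }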
    

    The function CMPSampleBufferCreateWithText is taken from the open source CoreMediaPlus.
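
    The sample-writing code needs a startTime and endTime for each chapter. If only the chapter start times are known, the end times can be derived. The sketch below assumes chapters run back to back with the last one ending at the movie's duration, which this answer does not state; ChapterSpan and ChapterSpansFromStarts are hypothetical names.

```c
#include <stddef.h>

/* Hypothetical helper type: one chapter's time span, in seconds. */
typedef struct {
    double startTime;
    double endTime;
} ChapterSpan;

/* Fills spans[i] so that chapter i runs from starts[i] to the start of the
   next chapter, and the last chapter runs to movieDuration.
   Returns 0 on success, or -1 if a start time is negative or a chapter
   would have zero or negative length (spans may be partly filled then). */
static int ChapterSpansFromStarts(const double *starts, size_t count,
                                  double movieDuration, ChapterSpan *spans)
{
    for (size_t i = 0; i < count; ++i) {
        double end = (i + 1 < count) ? starts[i + 1] : movieDuration;
        if (starts[i] < 0.0 || end <= starts[i]) {
            return -1;
        }
        spans[i].startTime = starts[i];
        spans[i].endTime = end;
    }
    return 0;
}
```

    Each resulting span supplies the startTime and endTime used to build the CMSampleTimingInfo for one text sample.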