ios avfoundation avassetexportsession avkit

AVAssetExportSession: trackIDs don't persist


I am adding a voiceover track to an existing video, which may include audio tracks. I'm using AVAssetExportSession to render the video to a .mov.

I'd like to keep track of which audio track is the voiceover, so if the user later wants to edit the voiceover, I know which track to update. I thought I would use a unique trackID.

So I call:

let compositionAudioTrack = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: myAudioTrackID)!

where myAudioTrackID is a random Int32 I generated. I have confirmed that the trackID of the returned compositionAudioTrack is indeed equal to myAudioTrackID.

But when I render the video using AVAssetExportSession, the trackID is reassigned to some other number.

How can I keep track of which audio track is mine?


Solution

  • It's true that AVAssetExportSession discards the trackIDs you chose.

    But private-use language tags, as defined by IETF BCP 47 (RFC 4646), do persist, at least in the mov and mp4 formats, so you can embed your own identifier in the track's language tag and use it to find your special track later.

    import AVFoundation
    
    let inputAudioTrack1Url = URL(fileURLWithPath: "/path/to/someaudiofile1.m4a")
    let inputAudioTrack2Url = URL(fileURLWithPath: "/path/to/someaudiofile2.m4a")
    let outputUrl = URL(fileURLWithPath: "/path/to/output.mov")
    
    func doExport() async throws {
        let composition = AVMutableComposition()
        
        let asset1 = AVURLAsset(url: inputAudioTrack1Url)
        let track1 = try await asset1.loadTracks(withMediaType: .audio).first!
        
        let asset2 = AVURLAsset(url: inputAudioTrack2Url)
        let track2 = try await asset2.loadTracks(withMediaType: .audio).first!
    
        // these IDs do not end up in the exported file
        let compositionTrack1 = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: 10)!
        let compositionTrack2 = composition.addMutableTrack(withMediaType: .audio, preferredTrackID: 11)!
    
        // https://datatracker.ietf.org/doc/html/rfc4646
        // use "private use" language tags (prefixed with "x-") to identify tracks
        compositionTrack1.extendedLanguageTag = "x-my-tag-\(123)"
        compositionTrack2.extendedLanguageTag = "x-my-tag-\(456)"
        
        let range = CMTimeRange(start: .zero, end: CMTimeMake(value: 3, timescale: 1))
        try compositionTrack1.insertTimeRange(range, of: track1, at: .zero)
        try compositionTrack2.insertTimeRange(range, of: track2, at: .zero)
        
        try? FileManager.default.removeItem(at: outputUrl)
        
        let session = AVAssetExportSession(asset: composition, presetName: AVAssetExportPresetHighestQuality)!
        session.outputURL = outputUrl
        session.outputFileType = .mov
        
        await session.export()
        print("finished: \(session.status.rawValue), \(String(describing: session.error))")
    
        try await self.doImport()
    }
    
    // prints
    // track id 1 tag Optional("x-my-tag-123")
    // track id 2 tag Optional("x-my-tag-456")
    func doImport() async throws {
        let asset = AVURLAsset(url: outputUrl)
        let tracks = try await asset.loadTracks(withMediaType: .audio)
        
        for track in tracks {
            let extendedLanguageTag = try await track.load(.extendedLanguageTag)
            print("track id \(track.trackID) tag \(String(describing: extendedLanguageTag))")
        }
    }
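
To get from "the tag survives the export" to "which audio track is mine", something like the following sketch could work. The `x-voice-` prefix and the helper names `voiceoverTag(for:)`, `voiceoverID(from:)` and `findVoiceoverTrack(in:id:)` are made up for illustration and are not part of AVFoundation. The Int32 is hex-encoded because BCP 47 limits each private-use subtag to 8 alphanumeric characters; I have not checked whether AVFoundation actually enforces that limit. The comparison is lowercased on the assumption that the tag's case might not round-trip exactly.

```swift
import Foundation
#if canImport(AVFoundation)
import AVFoundation
#endif

// Hypothetical prefix; any private-use tag ("x-...") should do.
let voiceTagPrefix = "x-voice-"

/// Builds a private-use language tag that embeds `id` as at most 8 hex digits.
func voiceoverTag(for id: Int32) -> String {
    voiceTagPrefix + String(UInt32(bitPattern: id), radix: 16)
}

/// Recovers the embedded ID, or nil if the tag is missing or not one of ours.
func voiceoverID(from tag: String?) -> Int32? {
    guard let tag = tag?.lowercased(), tag.hasPrefix(voiceTagPrefix) else { return nil }
    let hex = tag.dropFirst(voiceTagPrefix.count)
    guard let bits = UInt32(hex, radix: 16) else { return nil }
    return Int32(bitPattern: bits)
}

#if canImport(AVFoundation)
/// Returns the first audio track whose extendedLanguageTag carries `id`.
@available(macOS 12, iOS 15, tvOS 15, watchOS 8, *)
func findVoiceoverTrack(in asset: AVAsset, id: Int32) async throws -> AVAssetTrack? {
    for track in try await asset.loadTracks(withMediaType: .audio) {
        let tag = try await track.load(.extendedLanguageTag)
        if voiceoverID(from: tag) == id { return track }
    }
    return nil
}
#endif
```

When building the composition you would set `compositionAudioTrack.extendedLanguageTag = voiceoverTag(for: myAudioTrackID)`, and after re-opening the exported file call `findVoiceoverTrack(in:id:)` with the same ID instead of relying on `trackID`.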