iOS - Convert video's audio to AAC


I'm trying to encode audio in any input format to AAC with a 44100 Hz sample rate.

So basically: input (MP3, AAC, etc., any sample rate) -> AAC (44100 Hz)

The source audio comes from a video (mp4). I can already extract it to m4a (AAC), but I also want to change the sample rate to 44100 Hz.

I'm trying to achieve this with AVAssetReader and AVAssetWriter, but I'm not sure whether it's possible or whether it's the best solution. Any other solution would be very much appreciated!

Here's my code so far:

    // Input video audio (.mp4)
    AVAsset *videoAsset = <mp4 video asset>;
    NSArray<AVAssetTrack *> *videoAudioTracks = [videoAsset tracksWithMediaType:AVMediaTypeAudio];
    AVAssetTrack *videoAudioTrack = [videoAudioTracks objectAtIndex:0];

    // Output audio (.m4a AAC)
    NSURL *exportUrl = <m4a, aac output file URL>;

    // ASSET READER
    NSError *error = nil;
    AVAssetReader *assetReader = [AVAssetReader assetReaderWithAsset:videoAsset
                                                               error:&error];
    // Check the returned object rather than the error pointer (Cocoa convention)
    if (!assetReader) {
        NSLog(@"error:%@", error);
        return;
    }

    // Asset reader output
    AVAssetReaderOutput *assetReaderOutput =[AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoAudioTrack
                                                                                       outputSettings:nil];
    if(![assetReader canAddOutput:assetReaderOutput]) {
        NSLog(@"Can't add output!");
        return;
    }

    [assetReader addOutput:assetReaderOutput];

    // ASSET WRITER
    AVAssetWriter *assetWriter = [AVAssetWriter assetWriterWithURL:exportUrl
                                                          fileType:AVFileTypeAppleM4A
                                                             error:&error];
    if (!assetWriter) {
        NSLog(@"error:%@", error);
        return;
    }

    AudioChannelLayout channelLayout;
    memset(&channelLayout, 0, sizeof(AudioChannelLayout));
    channelLayout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;

    NSDictionary *outputSettings = @{AVFormatIDKey: @(kAudioFormatMPEG4AAC),
            AVNumberOfChannelsKey: @2,
            AVSampleRateKey: @44100.0F,
            AVChannelLayoutKey: [NSData dataWithBytes:&channelLayout length:sizeof(AudioChannelLayout)],
            AVEncoderBitRateKey: @64000};

    /*NSDictionary *outputSettings = [NSDictionary dictionaryWithObjectsAndKeys:
                                    [NSNumber numberWithInt:kAudioFormatLinearPCM], AVFormatIDKey,
                                    [NSNumber numberWithFloat:44100.f], AVSampleRateKey,
                                    [NSNumber numberWithInt:2], AVNumberOfChannelsKey,
                                    [NSData dataWithBytes:&channelLayout length:sizeof(AudioChannelLayout)], AVChannelLayoutKey,
                                    [NSNumber numberWithInt:16], AVLinearPCMBitDepthKey,
                                    [NSNumber numberWithBool:NO], AVLinearPCMIsNonInterleaved,
                                    [NSNumber numberWithBool:NO],AVLinearPCMIsFloatKey,
                                    [NSNumber numberWithBool:NO], AVLinearPCMIsBigEndianKey,
                                    nil];*/

    // Asset writer input
    AVAssetWriterInput *assetWriterInput = [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio
                                                                              outputSettings:outputSettings];
    if ([assetWriter canAddInput:assetWriterInput])
        [assetWriter addInput:assetWriterInput];
    else {
        NSLog(@"can't add asset writer input... die!");
        return;
    }

    assetWriterInput.expectsMediaDataInRealTime = NO;

    [assetWriter startWriting];
    [assetReader startReading];

    CMTime startTime = CMTimeMake (0, videoAudioTrack.naturalTimeScale);
    [assetWriter startSessionAtSourceTime: startTime];

    __block UInt64 convertedByteCount = 0;
    dispatch_queue_t mediaInputQueue = dispatch_queue_create("mediaInputQueue", NULL);

    [assetWriterInput requestMediaDataWhenReadyOnQueue:mediaInputQueue
                                            usingBlock: ^
                                            {
                                                while (assetWriterInput.readyForMoreMediaData)
                                                {
                                                    CMSampleBufferRef nextBuffer = [assetReaderOutput copyNextSampleBuffer];
                                                    if (nextBuffer)
                                                    {
                                                        // append buffer
                                                        [assetWriterInput appendSampleBuffer: nextBuffer];
                                                        convertedByteCount += CMSampleBufferGetTotalSampleSize (nextBuffer);

                                                        CMSampleBufferInvalidate(nextBuffer);
                                                        CFRelease(nextBuffer);
                                                        nextBuffer = NULL;
                                                    }
                                                    else
                                                    {
                                                        [assetWriterInput markAsFinished];
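                                                        // Finalize the output file; without this call the .m4a is never completed
                                                        [assetWriter finishWritingWithCompletionHandler:^{ /* output file is finalized here */ }];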
                                                        [assetReader cancelReading];

                                                        break;
                                                    }
                                                }
                                            }]; 

And here is the error I get with a video that contains an MP3 audio track:

    Terminating app due to uncaught exception 'NSInvalidArgumentException', reason: '*** -[AVAssetWriterInput appendSampleBuffer:] Cannot append sample buffer: Input buffer must be in an uncompressed format when outputSettings is not nil'

Any help would be much appreciated, thanks!


Solution

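  • With outputSettings:nil, the reader output vends samples in the format in which they are stored (compressed MP3 in this case), while an AVAssetWriterInput created with non-nil outputSettings only accepts uncompressed buffers, hence the exception. You should be able to fix this by configuring your AVAssetReaderOutput to decode to linear PCM at the target sample rate:

    NSDictionary *readerOutputSettings = @{ AVSampleRateKey: @44100, AVFormatIDKey: @(kAudioFormatLinearPCM) };

    AVAssetReaderOutput *assetReaderOutput = [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:videoAudioTrack
                                                                                        outputSettings:readerOutputSettings];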
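
For reference, here is a rough end-to-end sketch of how the pieces might fit together: the reader decodes to linear PCM, the writer encodes to AAC, and the file is finalized once the reader runs dry. The helper name ConvertAudioToAAC, its completion block, and the queue label are hypothetical names made up for illustration. The writer settings are copied from the question. Treat this as an untested outline under those assumptions rather than a definitive implementation.

```objc
#import <AVFoundation/AVFoundation.h>

// Hypothetical helper (not part of AVFoundation): transcodes the first audio
// track of `asset` to 44.1 kHz stereo AAC and writes it to `outputURL`.
static void ConvertAudioToAAC(AVAsset *asset, NSURL *outputURL,
                              void (^completion)(BOOL success, NSError *error)) {
    AVAssetTrack *audioTrack = [asset tracksWithMediaType:AVMediaTypeAudio].firstObject;
    if (!audioTrack) { completion(NO, nil); return; }

    NSError *error = nil;
    AVAssetReader *reader = [AVAssetReader assetReaderWithAsset:asset error:&error];
    if (!reader) { completion(NO, error); return; }

    // Key step: ask the reader to decode to linear PCM at the target rate, so
    // the writer input receives uncompressed buffers.
    NSDictionary *readerSettings = @{ AVFormatIDKey: @(kAudioFormatLinearPCM),
                                      AVSampleRateKey: @44100 };
    AVAssetReaderTrackOutput *readerOutput =
        [AVAssetReaderTrackOutput assetReaderTrackOutputWithTrack:audioTrack
                                                   outputSettings:readerSettings];
    if (![reader canAddOutput:readerOutput]) { completion(NO, nil); return; }
    [reader addOutput:readerOutput];

    AVAssetWriter *writer = [AVAssetWriter assetWriterWithURL:outputURL
                                                     fileType:AVFileTypeAppleM4A
                                                        error:&error];
    if (!writer) { completion(NO, error); return; }

    // Writer settings taken from the question: stereo AAC, 44.1 kHz, 64 kbit/s.
    AudioChannelLayout layout;
    memset(&layout, 0, sizeof(layout));
    layout.mChannelLayoutTag = kAudioChannelLayoutTag_Stereo;
    NSDictionary *writerSettings = @{ AVFormatIDKey: @(kAudioFormatMPEG4AAC),
                                      AVNumberOfChannelsKey: @2,
                                      AVSampleRateKey: @44100,
                                      AVChannelLayoutKey: [NSData dataWithBytes:&layout length:sizeof(layout)],
                                      AVEncoderBitRateKey: @64000 };
    AVAssetWriterInput *writerInput =
        [AVAssetWriterInput assetWriterInputWithMediaType:AVMediaTypeAudio
                                           outputSettings:writerSettings];
    writerInput.expectsMediaDataInRealTime = NO;
    if (![writer canAddInput:writerInput]) { completion(NO, nil); return; }
    [writer addInput:writerInput];

    if (![writer startWriting]) { completion(NO, writer.error); return; }
    if (![reader startReading]) { [writer cancelWriting]; completion(NO, reader.error); return; }
    [writer startSessionAtSourceTime:kCMTimeZero];

    dispatch_queue_t queue = dispatch_queue_create("audio.convert.queue", DISPATCH_QUEUE_SERIAL);
    [writerInput requestMediaDataWhenReadyOnQueue:queue usingBlock:^{
        while (writerInput.readyForMoreMediaData) {
            CMSampleBufferRef buffer = [readerOutput copyNextSampleBuffer];
            if (buffer) {
                BOOL appended = [writerInput appendSampleBuffer:buffer];
                CFRelease(buffer);
                if (!appended) {
                    // The writer has failed; stop reading and report its error.
                    [reader cancelReading];
                    completion(NO, writer.error);
                    return;
                }
            } else if (reader.status == AVAssetReaderStatusCompleted) {
                // No more samples: close the input and finalize the file.
                [writerInput markAsFinished];
                [writer finishWritingWithCompletionHandler:^{
                    completion(writer.status == AVAssetWriterStatusCompleted, writer.error);
                }];
                return;
            } else {
                // The reader stopped early (failed or cancelled).
                [writer cancelWriting];
                completion(NO, reader.error);
                return;
            }
        }
    }];
}
```

The completion block runs on a background queue, so hop back to the main queue before touching UI. The main difference from the code in the question, apart from the reader settings, is that the writer is finalized with finishWritingWithCompletionHandler: once the reader reports completion.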