android-ndk ffmpeg sample pcm libav

How to convert sample format from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16?


I am decoding AAC to PCM with FFmpeg using avcodec_decode_audio3. However, it decodes into the AV_SAMPLE_FMT_FLTP sample format (32-bit float PCM, planar) and I need AV_SAMPLE_FMT_S16 (signed 16-bit PCM, S16LE).

I know that the ffmpeg command-line tool can do this easily with -sample_fmt. I want to do the same in code, but I still haven't figured it out.

audio_resample did not work for me: it fails with the error message: .... conversion failed.


Solution

  • EDIT 9th April 2013: Worked out how to use libswresample to do this... much faster!

    At some point in the last 2-3 years FFmpeg's AAC decoder's output format changed from AV_SAMPLE_FMT_S16 to AV_SAMPLE_FMT_FLTP. This means that each audio channel has its own buffer (planar layout), and each sample value is a 32-bit floating point value scaled from -1.0 to +1.0.

    With AV_SAMPLE_FMT_S16, by contrast, the data is in a single buffer with the channels' samples interleaved, and each sample is a signed 16-bit integer from -32768 to +32767.
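    To make the difference in value range concrete, here is a minimal sketch of mapping one float sample to a signed 16-bit value. The helper name float_to_s16 is hypothetical (it is not part of FFmpeg), and scaling by 32767 with truncation is just one common choice, so -32768 is never produced:

```c
#include <stdint.h>

/* Map one float sample, nominally in [-1.0, +1.0], to signed 16-bit PCM.
 * Out-of-range input is clamped first so the cast cannot overflow.
 * Hypothetical helper for illustration; not an FFmpeg function. */
static int16_t float_to_s16(float sample)
{
    if (sample < -1.0f) sample = -1.0f;
    else if (sample > 1.0f) sample = 1.0f;
    return (int16_t)(sample * 32767.0f);  /* truncates toward zero */
}
```

    Decoders can occasionally emit values slightly outside the nominal range, which is why the clamp comes before the cast.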

    And if you really need your audio as AV_SAMPLE_FMT_S16, then you have to do the conversion yourself. I figured out two ways to do it:

    1. Use libswresample (recommended)

    #include "libswresample/swresample.h"
    #include "libavutil/opt.h"   // av_opt_set_int, av_opt_set_sample_fmt
    
    ...
    
    SwrContext *swr;
    
    ...
    
    // Set up SWR context once you've got codec information
    swr = swr_alloc();
    av_opt_set_int(swr, "in_channel_layout",  audioCodec->channel_layout, 0);
    av_opt_set_int(swr, "out_channel_layout", audioCodec->channel_layout,  0);
    av_opt_set_int(swr, "in_sample_rate",     audioCodec->sample_rate, 0);
    av_opt_set_int(swr, "out_sample_rate",    audioCodec->sample_rate, 0);
    av_opt_set_sample_fmt(swr, "in_sample_fmt",  AV_SAMPLE_FMT_FLTP, 0);
    av_opt_set_sample_fmt(swr, "out_sample_fmt", AV_SAMPLE_FMT_S16,  0);
    swr_init(swr);
    
    ...
    
    // In your decoder loop, after decoding an audio frame:
    AVFrame *audioFrame = ...;
    int16_t* outputBuffer = ...;
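    // swr_convert takes the context first; buffers are passed as uint8_t** plane arrays
    swr_convert(swr, (uint8_t **)&outputBuffer, audioFrame->nb_samples,
                (const uint8_t **)audioFrame->extended_data, audioFrame->nb_samples);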
    

    And that's all you have to do!
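    The snippet above leaves the allocation of outputBuffer elided. As a rough sketch, assuming the input and output sample rates are equal (as in the setup above) so that each call yields at most nb_samples samples per channel, the minimum size of an interleaved S16 buffer can be computed as follows. The helper name s16_interleaved_bytes is hypothetical; FFmpeg's own av_samples_get_buffer_size serves a similar purpose:

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to hold nb_samples samples per channel of interleaved
 * signed 16-bit PCM. Returns 0 for non-positive arguments.
 * Hypothetical helper for illustration; not an FFmpeg function. */
static size_t s16_interleaved_bytes(int nb_samples, int channels)
{
    if (nb_samples <= 0 || channels <= 0)
        return 0;
    return (size_t)nb_samples * (size_t)channels * sizeof(int16_t);
}
```

    For a typical 1024-sample stereo AAC frame this gives 4096 bytes.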

    2. Do it by hand in C (original answer, not recommended)

    So in your decode loop, when you've got an audio packet you decode it like this:

    AVCodecContext *audioCodec;   // init'd elsewhere
    AVFrame *audioFrame;          // init'd elsewhere
    AVPacket packet;              // init'd elsewhere
    int16_t* outputBuffer;        // init'd elsewhere
    int got_frame = 0;            // set non-zero when a complete frame was decoded
    ...
    int len = avcodec_decode_audio4(audioCodec, audioFrame, &got_frame, &packet);
    

    And then, if you've got a full frame of audio, you can convert it fairly easily:

        // Convert from AV_SAMPLE_FMT_FLTP to AV_SAMPLE_FMT_S16
        int in_samples = audioFrame->nb_samples;
        int in_linesize = audioFrame->linesize[0];
        int i=0;
        float* inputChannel0 = (float*)audioFrame->extended_data[0];
        // Mono
        if (audioFrame->channels==1) {
            for (i=0 ; i<in_samples ; i++) {
                float sample = *inputChannel0++;
                if (sample<-1.0f) sample=-1.0f; else if (sample>1.0f) sample=1.0f;
                outputBuffer[i] = (int16_t) (sample * 32767.0f);
            }
        }
        // Stereo
        else {
            float* inputChannel1 = (float*)audioFrame->extended_data[1];
            for (i=0 ; i<in_samples ; i++) {
                outputBuffer[i*2] = (int16_t) ((*inputChannel0++) * 32767.0f);
                outputBuffer[i*2+1] = (int16_t) ((*inputChannel1++) * 32767.0f);
            }
        }
        // outputBuffer now contains 16-bit PCM!
    

    I've left a couple of things out for clarity: the clamping in the mono path should ideally be duplicated in the stereo path, and the code can easily be optimized.
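    As one possible way to fold the mono and stereo paths together with clamping applied everywhere, here is a sketch that handles any channel count. It is written against plain float pointers rather than AVFrame so it stands alone; the function name planar_float_to_s16 and its parameters are hypothetical, not FFmpeg API:

```c
#include <stdint.h>

/* Convert planar float audio to interleaved signed 16-bit PCM.
 * planes[ch] points at nb_samples floats for channel ch;
 * out must have room for nb_samples * channels int16_t values.
 * Hypothetical helper for illustration; not an FFmpeg function. */
static void planar_float_to_s16(const float *const *planes, int channels,
                                int nb_samples, int16_t *out)
{
    for (int i = 0; i < nb_samples; i++) {
        for (int ch = 0; ch < channels; ch++) {
            float s = planes[ch][i];
            /* Clamp every channel, not just the mono case */
            if (s < -1.0f) s = -1.0f;
            else if (s > 1.0f) s = 1.0f;
            out[i * channels + ch] = (int16_t)(s * 32767.0f);
        }
    }
}
```

    With a decoded frame, the call would presumably look like planar_float_to_s16((const float *const *)audioFrame->extended_data, audioFrame->channels, audioFrame->nb_samples, outputBuffer), assuming the frame really is AV_SAMPLE_FMT_FLTP.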