Tags: c++, ffmpeg, decode, audio-streaming, aac

How to decode AAC network audio stream using ffmpeg


I implemented a network video player (like VLC) using ffmpeg, but it cannot decode the AAC audio stream received from an IP camera. It can decode other audio streams such as G711 and G726. I set the codec ID to AV_CODEC_ID_AAC and set the channels and sample rate on the AVCodecContext, but avcodec_decode_audio4 fails with the error code INVALID_DATA. I checked previously asked questions and tried adding extradata to the AVCodecContext from the media-format-specific parameter "config=1408", setting the extradata to the 2 bytes "20" and "8" (0x14 and 0x08), but that did not work either. I appreciate any help, thanks.

IP CAMERA SDP:
a=rtpmap:96 mpeg4-generic/16000/1
a=fmtp:96 streamtype=5; profile-level-id=5; mode=AAC-hbr; config=1408; SizeLength=13; IndexLength=3; IndexDeltaLength=3

AVCodec* decoder = avcodec_find_decoder((::AVCodecID)id); // id is set to AV_CODEC_ID_AAC

AVCodecContext* decoderContext = avcodec_alloc_context3(decoder);   

char* test = (char*)System::Runtime::InteropServices::Marshal::StringToHGlobalAnsi("1408").ToPointer();
unsigned int length;
uint8_t* extradata = parseGeneralConfigStr(test, length);//it is set as 0x14 and 0x08

decoderContext->channels = number_of_channels; //set as 1
decoderContext->sample_rate = sample_rate; //set as 16000
decoderContext->channel_layout = AV_CH_LAYOUT_MONO;
decoderContext->codec_type = AVMEDIA_TYPE_AUDIO;

decoderContext->extradata = (uint8_t*)av_malloc(AV_INPUT_BUFFER_PADDING_SIZE + length);
memcpy(decoderContext->extradata, extradata, length);
memset(decoderContext->extradata+ length, 0, AV_INPUT_BUFFER_PADDING_SIZE);
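
For reference, the two config bytes can be checked independently of ffmpeg. The following is a minimal sketch, assuming the config value is a plain 2-byte MPEG-4 AudioSpecificConfig (audio object type below 31, no explicit sampling frequency). The names hexToBytes, parseAudioSpecificConfig and sampleRateFromIndex are hypothetical helpers written for this illustration; they are not part of ffmpeg or of the parseGeneralConfigStr helper used above.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Convert a hex string such as "1408" into raw bytes {0x14, 0x08}.
std::vector<uint8_t> hexToBytes(const std::string& hex) {
    if (hex.size() % 2 != 0) throw std::invalid_argument("odd-length hex string");
    std::vector<uint8_t> out;
    out.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2) {
        std::size_t used = 0;
        unsigned long v = std::stoul(hex.substr(i, 2), &used, 16);
        if (used != 2) throw std::invalid_argument("non-hex character");
        out.push_back(static_cast<uint8_t>(v));
    }
    return out;
}

// The first 13 bits of a simple AudioSpecificConfig.
struct AudioSpecificConfig {
    unsigned audioObjectType;        // 5 bits, 2 = AAC LC
    unsigned samplingFrequencyIndex; // 4 bits, index into the rate table below
    unsigned channelConfiguration;   // 4 bits, 1 = mono
};

// Assumption: object type < 31 and frequency index != 15 (no escape values).
AudioSpecificConfig parseAudioSpecificConfig(const std::vector<uint8_t>& b) {
    if (b.size() < 2) throw std::invalid_argument("need at least 2 bytes");
    AudioSpecificConfig c;
    c.audioObjectType = b[0] >> 3;
    c.samplingFrequencyIndex = ((b[0] & 0x07u) << 1) | (b[1] >> 7);
    c.channelConfiguration = (b[1] >> 3) & 0x0Fu;
    return c;
}

// MPEG-4 sampling frequency table; returns 0 for reserved indexes.
unsigned sampleRateFromIndex(unsigned index) {
    static const unsigned rates[] = {96000, 88200, 64000, 48000, 44100, 32000, 24000,
                                     22050, 16000, 12000, 11025, 8000,  7350};
    return index < sizeof(rates) / sizeof(rates[0]) ? rates[index] : 0;
}
```

For "1408" this gives audio object type 2 (AAC LC), sampling frequency index 8 (16000 Hz) and channel configuration 1, which agrees with the SDP rtpmap line. Note that ffmpeg also reads decoderContext->extradata_size to know how many extradata bytes are valid; the snippet above does not show that assignment.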

Solution

  • Have you inspected the payload that triggers INVALID_DATA?
    You can check it against the RFC:

    RFC 3640, section 3.2 (RTP Payload Structure)

    Going by the example below, the AAC payload from this camera can be separated like this:
    AU-headers-length | AU-header | ADTS header | Data

    Example payload: 00 10 0c 00 ff f1 60 40 30 01 7c 01 30 35 ac

    According to the fmtp parameters that you shared, each AU-header contains:
    AU-size (SizeLength=13)
    AU-Index / AU-Index-delta (IndexLength=3 / IndexDeltaLength=3)

    So the length in bits of one AU-header is 13 (SizeLength) + 3 (IndexLength/IndexDeltaLength) = 16.

    The payload starts with the two-octet AU-headers-length field, which gives the total length in bits of the AU-headers that follow.
    AU-headers-length: 00 10 = 16 bits, i.e. exactly one AU-header

    AU-size: Indicates the size in octets of the associated Access Unit in the Access Unit Data Section in the same RTP packet.

    AU-header: 0c 00
    The first 13 (SizeLength) bits, 0000110000000, equal 384, so the Access Unit is 384 octets long. The remaining 3 bits (AU-Index) are 0.
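
    The header bytes can also be checked in code. Below is a minimal sketch that reads the AU Header Section the way RFC 3640 section 3.2.1 lays it out: a two-octet AU-headers-length field (a length in bits) followed by the AU-headers. It only reads the first AU-header and assumes that header consists of just AU-size and AU-Index, as in the AAC-hbr parameters above. AuHeaderInfo, readBits and parseFirstAuHeader are hypothetical names for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>

struct AuHeaderInfo {
    unsigned headersLengthBits; // AU-headers-length field, in bits
    unsigned auSize;            // AU-size of the first AU-header, in octets
    unsigned auIndex;           // AU-Index of the first AU-header
    std::size_t dataOffset;     // byte offset of the Access Unit Data Section
};

// Read `count` bits, most significant bit first, starting at bit `bitPos`.
static unsigned readBits(const uint8_t* data, std::size_t bitPos, unsigned count) {
    unsigned value = 0;
    for (unsigned i = 0; i < count; ++i, ++bitPos) {
        value = (value << 1) | ((data[bitPos / 8] >> (7 - bitPos % 8)) & 1u);
    }
    return value;
}

// sizeLength and indexLength come from the SDP fmtp line (13 and 3 here).
AuHeaderInfo parseFirstAuHeader(const uint8_t* payload, std::size_t size,
                                unsigned sizeLength, unsigned indexLength) {
    if (size < 2) throw std::invalid_argument("payload too short");
    AuHeaderInfo info{};
    info.headersLengthBits =
        (static_cast<unsigned>(payload[0]) << 8) | static_cast<unsigned>(payload[1]);
    // The AU-headers are padded up to a whole number of octets.
    const std::size_t headerBytes = (info.headersLengthBits + 7) / 8;
    if (info.headersLengthBits < sizeLength + indexLength || size < 2 + headerBytes) {
        throw std::invalid_argument("truncated AU Header Section");
    }
    const uint8_t* headers = payload + 2;
    info.auSize = readBits(headers, 0, sizeLength);
    info.auIndex = readBits(headers, sizeLength, indexLength);
    info.dataOffset = 2 + headerBytes;
    return info;
}
```

    Called on the example payload with sizeLength 13 and indexLength 3, this yields AU-headers-length 16, AU-size 384 and AU-Index 0, with the Access Unit data starting at byte offset 4 (ff f1 ...).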

    ADTS header: ff f1 60 40 30 01 7c
    Parsed ADTS header fields:

    ID: MPEG-4
    MPEG Layer: 0
    CRC checksum absent: 1
    Profile: Low Complexity profile (AAC LC)
    Sampling frequency: 16000
    Private bit: 0
    Channel configuration: 1
    Original/copy: 0
    Home: 0
    Copyright identification bit: 0
    Copyright identification start: 0
    AAC frame length: 384
    ADTS buffer fullness: 95
    Number of raw data blocks in frame: 0

    The raw AAC data starts right after the 7-byte ADTS header, with 01 30 35 ac.
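
    The ADTS field values listed above can be reproduced with a few shifts and masks. This is a minimal sketch of a fixed-layout ADTS header parser, not ffmpeg's own parser; AdtsHeader, parseAdtsHeader and toAudioSpecificConfig are hypothetical names for this illustration. The last helper rebuilds the 2-byte AudioSpecificConfig from the ADTS fields so it can be compared with the config= value in the SDP.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <stdexcept>

struct AdtsHeader {
    bool mpeg2;                      // ID bit: false = MPEG-4, true = MPEG-2
    unsigned layer;                  // always 0 for AAC
    bool protectionAbsent;           // true = no CRC follows the header
    unsigned audioObjectType;        // ADTS profile field + 1 (2 = AAC LC)
    unsigned samplingFrequencyIndex; // 8 = 16000 Hz
    unsigned channelConfiguration;   // 1 = mono
    unsigned frameLength;            // whole frame in bytes, header included
    unsigned bufferFullness;
    unsigned rawDataBlocks;          // field value; the frame holds this + 1 blocks
    std::size_t headerSize;          // 7 bytes, or 9 when a CRC is present
};

AdtsHeader parseAdtsHeader(const uint8_t* p, std::size_t size) {
    if (size < 7) throw std::invalid_argument("need at least 7 bytes");
    if (p[0] != 0xFF || (p[1] & 0xF0) != 0xF0) {
        throw std::invalid_argument("missing ADTS syncword");
    }
    AdtsHeader h{};
    h.mpeg2 = ((p[1] >> 3) & 1u) != 0;
    h.layer = (p[1] >> 1) & 3u;
    h.protectionAbsent = (p[1] & 1u) != 0;
    h.audioObjectType = ((p[2] >> 6) & 3u) + 1;
    h.samplingFrequencyIndex = (p[2] >> 2) & 0x0Fu;
    h.channelConfiguration = ((p[2] & 1u) << 2) | ((p[3] >> 6) & 3u);
    h.frameLength = ((p[3] & 3u) << 11) | (static_cast<unsigned>(p[4]) << 3) |
                    ((p[5] >> 5) & 7u);
    h.bufferFullness = ((p[5] & 0x1Fu) << 6) | ((p[6] >> 2) & 0x3Fu);
    h.rawDataBlocks = p[6] & 3u;
    h.headerSize = h.protectionAbsent ? 7 : 9;
    return h;
}

// Pack object type (5 bits), frequency index (4 bits) and channel
// configuration (4 bits) into the first two AudioSpecificConfig bytes.
std::array<uint8_t, 2> toAudioSpecificConfig(const AdtsHeader& h) {
    return {static_cast<uint8_t>((h.audioObjectType << 3) | (h.samplingFrequencyIndex >> 1)),
            static_cast<uint8_t>(((h.samplingFrequencyIndex & 1u) << 7) |
                                 (h.channelConfiguration << 3))};
}
```

    For ff f1 60 40 30 01 7c this gives frame length 384 and buffer fullness 95, and toAudioSpecificConfig returns 14 08, the same bytes as config=1408. Since the frame length equals the AU-size, the Access Unit sent by this camera appears to include the ADTS header. Whether the decoder should be handed the bytes after headerSize when the config bytes are supplied as extradata is an assumption to verify against your stream, not something the question confirms.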