ios audio encoding audiotoolbox lame

Encoding 32-bit AudioUnitCanonical stereo PCM samples to MP3 with LAME


I'm encoding 32-bit stereo samples, recorded from a digital line-in on iOS devices with a Mickey Blue microphone add-on, directly to MP3 with LAME.

The problem is that LAME only seems to accept 16-bit samples. Passing the 32-bit samples in directly just produces noise with no recognizable audio. These are the settings I use to record the audio on iOS:

AudioStreamBasicDescription stereoStreamFormat;
stereoStreamFormat.mSampleRate          = 44100.00;
stereoStreamFormat.mFormatID            = kAudioFormatLinearPCM;
stereoStreamFormat.mFormatFlags         = kAudioFormatFlagsAudioUnitCanonical;
stereoStreamFormat.mBytesPerPacket      = 4;
stereoStreamFormat.mBytesPerFrame       = 4;
stereoStreamFormat.mFramesPerPacket     = 1;
stereoStreamFormat.mChannelsPerFrame    = 2;
stereoStreamFormat.mBitsPerChannel      = 32;//changing this to 16bit does not work

I then encode the samples from two separate buffers (left and right) with LAME:

-(void) encodeMicrophoneWithLeft :(NSData*) left andRight : (NSData *) right
{
    void *mp3 = malloc(MP3_SIZE_MIC);
    int write = lame_encode_buffer(microphoneLame, left.bytes, right.bytes, left.length/2, mp3, MP3_SIZE_MIC);
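    if (write > 0) // a negative return value signals an encoder error
        [self.microphoneBuffer appendBytes:mp3 length:write];
    free(mp3); // release the scratch buffer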
}

With these settings:

lame_set_brate(microphoneLame, 128);// current streaming speed kbps
lame_set_in_samplerate(microphoneLame, 44100);
lame_set_VBR(microphoneLame, vbr_off);//set variable bitrate off
lame_set_num_channels(microphoneLame,2);

Is there any way to convert the 32-bit samples to 16-bit, or to make LAME work with 32-bit samples? For some reason the recorder doesn't work with any setting other than 32-bit for stereo, but if there's another way to initialise it with 16-bit, that would be a solution for me too.


Solution

  • It turns out kAudioFormatFlagsAudioUnitCanonical is not plain 32-bit PCM: on iOS it is a signed 8.24 fixed-point sample (8 integer bits including the sign, 24 fractional bits) stored in a 32-bit integer, not a floating-point value. For samples within the nominal range of -1.0 to 1.0, the top 8 bits are only sign extension, so the 7 most significant bits can be dropped and the lowest of those 8 bits serves as the sign.

           sign    actual sample data
          /       /
    |--8--|------24------|
         |----16----| <- part you need
    

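    This means I can shift each sample 9 bits to the right (1.0 is 1 << 24 in 8.24 and 1 << 15 in 16-bit PCM, a difference of 9 bits) and then cast it directly to a short, leaving me with 16-bit signed samples. Samples outside the -1.0 to 1.0 range would wrap around on the cast, so clamp them first if the input can overshoot.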

    Like this:

    NSData *left =[[NSData alloc]initWithBytes:tempBufferLeft.mData length:tempBufferLeft.mDataByteSize ];
    NSData *right =[[NSData alloc]initWithBytes:tempBufferRight.mData length:tempBufferRight.mDataByteSize ];
    
    convertedArrayLeft = [[NSMutableData alloc]init];
    convertedArrayRight = [[NSMutableData alloc]init];
    for (NSUInteger i = 0; i + 4 <= left.length; i += 4) // one 32-bit sample (4 bytes) per step
    {
        SInt32 tmpValueL, tmpValueR;
        [left getBytes:&tmpValueL range:NSMakeRange(i, 4)]; // read the 8.24 fixed-point sample as a 32-bit integer
        [right getBytes:&tmpValueR range:NSMakeRange(i, 4)];
    
        // shift right by 9 and store as short; for in-range samples the discarded upper 16 bits are only sign extension
        short endValueL = tmpValueL >> 9, endValueR = tmpValueR >> 9;
    
        [convertedArrayLeft appendBytes:&endValueL length:sizeof(short)];
        [convertedArrayRight appendBytes:&endValueR length:sizeof(short)];
    }
    

    I couldn't find out why it would not accept another format for stereo recording, but this works too, with a little overhead.
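
For reference, here is a minimal plain C sketch of the same 8.24 to 16-bit conversion that works on raw sample arrays instead of NSData and saturates out-of-range samples instead of letting the cast wrap. The function names fixed824_to_s16 and convert_824_to_s16 are made up for illustration and are not part of LAME or Core Audio. The sketch assumes the compiler performs an arithmetic right shift on negative integers, as clang does on iOS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convert one signed 8.24 fixed-point sample to 16-bit PCM.
   1.0 is 1 << 24 in 8.24 and 1 << 15 in 16-bit PCM, hence the shift by 9.
   Assumes >> on a negative int32_t is an arithmetic shift. */
static int16_t fixed824_to_s16(int32_t sample)
{
    int32_t scaled = sample >> 9;

    /* Saturate: 8.24 can represent values at or beyond +/-1.0,
       which would wrap around if cast to int16_t directly. */
    if (scaled > INT16_MAX) scaled = INT16_MAX;
    if (scaled < INT16_MIN) scaled = INT16_MIN;

    return (int16_t)scaled;
}

/* Convert count samples of one channel from 8.24 to 16-bit PCM.
   out must have room for count samples. */
static void convert_824_to_s16(const int32_t *in, int16_t *out, size_t count)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        out[i] = fixed824_to_s16(in[i]);
    }
}
```

The converted left and right arrays can then be handed to lame_encode_buffer, with the number of samples per channel (not the byte count) as the sample count. LAME also provides lame_encode_buffer_int for int input scaled to the full int range; feeding it 8.24 data would presumably mean shifting left by 7 with clamping instead of shifting right by 9, which is untested here.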