Tags: ios, objective-c, audio, audio-streaming, audiounit

Playing audio stream using audio unit, crackling sound occurs


In an iOS application, I am playing an audio stream received over BLE, using TPCircularBuffer and an Audio Unit.

The sound plays well, but whenever the buffer is empty and there are no bytes to play, I hear a crackling sound at those moments.

Here is my audio stream configuration:

    AudioStreamBasicDescription audioFormat;
    audioFormat.mSampleRate         = 8000.00;
    audioFormat.mFormatID           = kAudioFormatLinearPCM;
    audioFormat.mFormatFlags        = kAudioFormatFlagsNativeFloatPacked | kAudioFormatFlagIsPacked;
    // || kAudioFormatFlagsNativeFloatPacked || kAudioFormatFlagIsSignedInteger
    //kAudioFormatFlagIsSignedInteger
    audioFormat.mFramesPerPacket    = 1;
    audioFormat.mChannelsPerFrame   = 1;
    audioFormat.mBitsPerChannel     = 32; // Update when required
    audioFormat.mBytesPerPacket     = 4;  // Update when required
    audioFormat.mBytesPerFrame      = 4;  // Update when required


Below is the playback callback of the Audio Unit:

static OSStatus playbackCallback(void *inRefCon,
                                 AudioUnitRenderActionFlags *ioActionFlags,
                                 const AudioTimeStamp *inTimeStamp,
                                 UInt32 inBusNumber,
                                 UInt32 inNumberFrames,
                                 AudioBufferList *ioData) {

    //1
    for (int i = 0; i < ioData->mNumberBuffers; i++) {

        IosAudioController *THIS = (__bridge IosAudioController *)inRefCon;

        int bytesAskingByPlayback = ioData->mBuffers[i].mDataByteSize;

        SInt16 *targetBuffer = (SInt16 *)ioData->mBuffers[i].mData;

        // Pull audio from playthrough buffer
        int32_t availableBytesFromBuffer;

        SInt16 *sBuffer = TPCircularBufferTail(iosAudio.addressOfTPBuffer, &availableBytesFromBuffer);

        int willRemainBytes = availableBytesFromBuffer - bytesAskingByPlayback;

        if (willRemainBytes > 0) {
            memcpy(targetBuffer, sBuffer, bytesAskingByPlayback);
            TPCircularBufferConsume(iosAudio.addressOfTPBuffer, bytesAskingByPlayback);
        } else {

            //Note: Mostly need to update code here

            memcpy(targetBuffer, sBuffer, availableBytesFromBuffer);

            TPCircularBufferConsume(iosAudio.addressOfTPBuffer, availableBytesFromBuffer);

        }
    }
    return noErr;
}

Buffer size is 16384

Some answers suggest filling the target buffer with zeros to produce silence, but that is not working.

Others suggest filling the gap with the previous values.


Solution

    "but when buffer is empty and there are no bytes to play"

    You need to start here and determine why this is happening. Ideally this should never happen. If it does happen, it should be rare and for a clear reason. If the upstream has paused, then you need to pause your downstream. If there is network latency, then you need to increase the size of your buffer.

    Anytime you inject a constant value, whether zero or the last value received, you're going to inject high frequency noise. This will sound like a small pop. If you do this often, then it will crackle. There are techniques for smoothing this, but they're fairly complex, and not something you should reach for before fixing your underlying under-run problem.

    If your upstream has variable latency, then you will need to buffer a bit (10ms, 50ms, 2000ms, it depends on how variable it is) before you start your downstream. If your buffers are drained, you'll need to pause your downstream until you can build your buffer up again.
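The pre-buffering idea above can be sketched in plain C as a small gate that the render callback consults on each call. This is only a minimal sketch under assumptions: `PlaybackGate`, `ms_to_bytes`, and `gate_should_play` are hypothetical names that appear in neither the question nor TPCircularBuffer, and the 50 ms figure in the note below is an illustration, not a recommended value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical playback gate; names and thresholds are illustrative only. */
typedef struct {
    bool     playing;          /* false while the buffer is being (re)built */
    uint32_t start_threshold;  /* bytes required before playback (re)starts */
} PlaybackGate;

/* Convert a duration in milliseconds to a byte count for a PCM format. */
static uint32_t ms_to_bytes(uint32_t ms, uint32_t sample_rate,
                            uint32_t bytes_per_frame) {
    uint64_t frames = (uint64_t)ms * sample_rate / 1000u;
    return (uint32_t)(frames * bytes_per_frame);
}

/* Decide, for one render call, whether real audio should be played.
   available: bytes currently buffered; requested: bytes the callback wants. */
static bool gate_should_play(PlaybackGate *g, uint32_t available,
                             uint32_t requested) {
    assert(g != NULL);
    if (!g->playing) {
        /* Still buffering: wait until the threshold is reached. */
        if (available >= g->start_threshold && available >= requested) {
            g->playing = true;
        }
    } else if (available < requested) {
        /* Drained: stop and rebuild the buffer before resuming. */
        g->playing = false;
    }
    return g->playing;
}
```

With the format from the question (8000 Hz, 4 bytes per frame), `ms_to_bytes(50, 8000, 4)` is 1600 bytes. While the gate returns false, the callback would output silence and leave the circular buffer untouched, so the occasional transition into silence replaces a rapid series of partial fills.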

    Sometimes, it's worth the occasional "pop" to avoid pausing the downstream, and that's when advice like "fill with zeros" or "fill with last value" comes in. But if you're getting pops many times a second (and that's what "crackle" usually is), it likely means you're not buffering enough before starting your downstream.
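For the cases where an occasional pop is acceptable, a "fill with zeros" fallback might look like the sketch below. It does not fix the under-run itself. Note that the `else` branch in the question copies the available bytes but leaves the rest of the target buffer unwritten, so whatever was already in that memory gets played. The helper names `fill_or_silence` and `fade_out_tail` are hypothetical, and the fade assumes Float32 samples as in the question's stream description; it is one simple way to soften the jump into silence, not a full smoothing technique.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy up to `requested` bytes from src into dst, whole frames only.
   On under-run, zero the rest of dst so no stale data is played.
   Returns the number of bytes taken from src. */
static size_t fill_or_silence(void *dst, size_t requested,
                              const void *src, size_t available,
                              size_t bytes_per_frame) {
    assert(bytes_per_frame > 0);
    size_t n = available < requested ? available : requested;
    n -= n % bytes_per_frame;              /* never split a frame */
    if (n > 0) {
        memcpy(dst, src, n);
    }
    if (n < requested) {
        /* All-zero bytes are silence for both float and signed-integer PCM. */
        memset((char *)dst + n, 0, requested - n);
    }
    return n;
}

/* Linearly ramp the last `ramp` samples of a Float32 block down to zero,
   so the step into the zero-filled region is less abrupt. */
static void fade_out_tail(float *samples, size_t count, size_t ramp) {
    if (ramp > count) {
        ramp = count;
    }
    for (size_t i = 0; i < ramp; i++) {
        float gain = (float)(ramp - 1 - i) / (float)ramp;
        samples[count - ramp + i] *= gain;
    }
}
```

In the callback from the question, the return value of `fill_or_silence` would be the amount passed to `TPCircularBufferConsume`, and `fade_out_tail` would be applied only to the copied samples when an under-run occurred.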

    Depending on the nature of your audio, you also need to consider cases where there is a long pause in your upstream, and then you suddenly get a lot of past data. You have to decide at that point whether to keep it and increase latency, or discard it to get closer to "real-time." The strategies for this depend completely on your use case and are a major part of designing any real-time system.
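If the choice is to discard old data to stay closer to real time, the amount to drop can be computed as in the sketch below. The function name `bytes_to_drop` and the idea of a fixed latency cap are assumptions made for illustration; whether a hard cap suits your stream depends on the use case, and dropping audio introduces its own discontinuity.

```c
#include <assert.h>
#include <stdint.h>

/* Return how many of the oldest buffered bytes to discard so that roughly
   `max_bytes` remain. The result is rounded down to whole frames, so up to
   one frame more than `max_bytes` may stay buffered. */
static uint32_t bytes_to_drop(uint32_t buffered, uint32_t max_bytes,
                              uint32_t bytes_per_frame) {
    assert(bytes_per_frame > 0);
    if (buffered <= max_bytes) {
        return 0;                          /* within the latency budget */
    }
    uint32_t excess = buffered - max_bytes;
    return excess - (excess % bytes_per_frame);
}
```

With TPCircularBuffer, the result could be passed to `TPCircularBufferConsume` from the consumer side (the render callback), since that buffer is designed for a single producer and a single consumer.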

    Note that TPCircularBuffer and Audio Unit are pretty low-level tools, and put a lot of the work on you. Personally, I prefer building these kinds of systems with a somewhat higher-level tool like AVSampleBuffer. It's still challenging and you need to understand real-time systems, but AVFoundation will do a lot more of the work for you. (I personally often have to worry about things like pause, rewind, skip, and the like, so my problem may be different enough from yours that this advice isn't applicable.)