Processing data with Audio Unit recording callback [iOS][Swift]


I am creating a cross-platform VoIP application that uses UDP to send and receive data, with Audio Units for real-time recording and playback. Communication is fast and smooth with raw data, but when I add a codec such as Opus, the audio that is encoded and sent from iPhone to Android contains clicking and popping sounds. I have been pulling my hair out trying to solve this issue.

The encoded data coming from Android to iPhone plays perfectly, with no issues. I am using TPCircularBuffer to handle the data for both recording and playback.

This is what I have so far in the recording callback:

var samplesForEncoder: UInt32 = 640
var targetBuffer = [opus_int16](repeating: 0, count: 1500)

    _ = TPCircularBufferProduceBytes(&circularBuffer, mData, inNumberFrames * 2)
    self.samplesSinceLastCall += inNumberFrames

    encodingQueue.async {
        if self.samplesSinceLastCall > self.samplesForEncoder {
            let samplesToCopy = min(self.bytesToCopy, Int(self.availableBytes))
            self.bufferTailPointer = TPCircularBufferTail(&self.circularBuffer, &self.availableBytes)
            memcpy(&self.targetBuffer, self.bufferTailPointer, samplesToCopy)
            self.semaphore.signal()
            self.semaphore.wait()

            self.opusHelper?.encodeStream(of: self.targetBuffer)
            self.semaphore.signal()
            self.semaphore.wait()

            TPCircularBufferConsume(&self.circularBuffer, UInt32(samplesToCopy))
            self.samplesSinceLastCall = 0
            self.semaphore.signal()
            self.semaphore.wait()
        }
    }

This is the encoding function:

var encodedData = [UInt8](repeating: 0, count: 1500)

    self.encodedLength = opus_encode(self.encoder!, samples, OpusSettings.FRAME_SIZE, &self.encodedData, 1500)

    let opusSlice = Array(self.encodedData.prefix(Int(self.encodedLength!)))

    self.seqNumber += 1
    self.protoModel.sequenceNumber = self.seqNumber
    self.protoModel.timeStamp = Date().currentTimeInMillis()
    self.protoModel.payload = opusSlice.data

    do {
        _ = try self.udpClient?.send(data: self.protoModel)
    } catch {
        print(error.localizedDescription)
    }

I have tried to move the heavy processing to another thread using DispatchGroups, DispatchSourceTimers, DispatchSemaphores, and DispatchQueues, but I just cannot get the result that I need.

Can anyone guide me on how to make the encoding independent of the real-time audio thread? I tried to create a polling thread, but even that did not work. I need help transferring data between two threads with different data size requirements: I receive 341-342 bytes from the mic at a time but need to pass 640 bytes to the encoder, so I combine two incoming buffers and keep the leftover bytes for the next frame.

@hotpaw2 recommends this https://stackoverflow.com/a/58947295/12020007 but I just need a little more guidance.

Updated code as per @hotpaw2's answer:

Recording callback:

    _ = TPCircularBufferProduceBytes(&circularBuffer, mData, inNumberFrames * 2)
    self.samplesSinceLastCall += inNumberFrames

    if !shouldStartSending {
        startLooping()
    }

Updated polling thread:

    func startLooping() {
        loopingQueue.async {
            repeat {
                if self.samplesSinceLastCall > self.samplesForEncoder {
                    let samplesToCopy = min(self.bytesToCopy, Int(self.availableBytes))
                    self.bufferTailPointer = TPCircularBufferTail(&self.circularBuffer, &self.availableBytes)
                    memcpy(&self.targetBuffer, self.bufferTailPointer, samplesToCopy)
                    self.semaphore.signal()
                    self.semaphore.wait()

                    self.opusEncodedStream = self.opusHelper?.encodeStream(of: self.targetBuffer)
                    self.semaphore.signal()
                    self.semaphore.wait()

                    self.send(stream: self.opusEncodedStream!)
                    self.semaphore.signal()
                    self.semaphore.wait()

                    TPCircularBufferConsume(&self.circularBuffer, UInt32(samplesToCopy))
                    self.samplesSinceLastCall = 0
                }
                self.shouldStartSending = true
            } while true
        }
    }

Solution

  • Apple recommends against using semaphores or calling Swift methods (such as encoders) inside any real-time Audio Unit callback. Just copy the data into a pre-allocated circular buffer inside the audio unit callback. Period. Do everything else outside the callback. Semaphores and signals included.

    So, you need to create a polling thread.

    Do everything inside a polling loop, timer callback, or network ready callback. Do your work anytime there is enough data in the FIFO. Call (poll) often enough (high enough polling frequency or timer callback rate) that you do not lose data. Handle all the data you can (perhaps multiple buffers at a time, if available) in each iteration of the polling loop.

    You may need to pre-fill the circular buffer a bit (perhaps a few multiples of your 640-byte UDP frame size) before starting to send, to account for network and timer jitter.
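
The consumer side of that advice can be sketched as below. This is a minimal illustration, not the asker's code: `ByteFIFO`, `FrameDrainer`, and `simulate` are hypothetical names, and `ByteFIFO` is a plain Swift array standing in for TPCircularBuffer, so it is neither thread-safe nor real-time safe. The 640-byte frame size and 341-byte chunks come from the question; the two-frame pre-fill is an assumed value. The point is only the bookkeeping: the producer appends bytes, and each polling pass waits for the pre-fill, then drains every complete frame and leaves the remainder queued.

```swift
/// Hypothetical stand-in for TPCircularBuffer: a plain byte FIFO.
/// Not thread-safe and not real-time safe; it only models the bookkeeping.
struct ByteFIFO {
    private var storage: [UInt8] = []

    init() {}

    var availableBytes: Int { storage.count }

    /// What the recording callback would do: append the new bytes, nothing else.
    mutating func produce(_ bytes: [UInt8]) {
        storage.append(contentsOf: bytes)
    }

    /// Removes and returns exactly `count` bytes, or nil if fewer are queued.
    mutating func consume(_ count: Int) -> [UInt8]? {
        guard count > 0, storage.count >= count else { return nil }
        let head = Array(storage.prefix(count))
        storage.removeFirst(count)
        return head
    }
}

/// Drains whole encoder frames; meant to run on the polling thread or timer.
struct FrameDrainer {
    let frameBytes: Int      // bytes per encoder frame (640 in the question)
    let prefillFrames: Int   // frames queued before sending starts (assumed value)
    private(set) var started = false

    init(frameBytes: Int, prefillFrames: Int) {
        self.frameBytes = frameBytes
        self.prefillFrames = prefillFrames
    }

    /// One polling pass: returns every complete frame currently available.
    mutating func poll(_ fifo: inout ByteFIFO) -> [[UInt8]] {
        if !started {
            // Hold back until the pre-fill threshold is reached.
            guard fifo.availableBytes >= frameBytes * prefillFrames else { return [] }
            started = true
        }
        var frames: [[UInt8]] = []
        while let frame = fifo.consume(frameBytes) {
            frames.append(frame)  // leftover bytes stay in the FIFO
        }
        return frames
    }
}

/// Feeds `chunkCount` chunks of `chunkBytes` bytes and polls after each one.
func simulate(chunkCount: Int, chunkBytes: Int) -> (frames: Int, leftover: Int) {
    var fifo = ByteFIFO()
    var drainer = FrameDrainer(frameBytes: 640, prefillFrames: 2)
    var frames = 0
    for _ in 0..<chunkCount {
        fifo.produce([UInt8](repeating: 0, count: chunkBytes))
        frames += drainer.poll(&fifo).count  // encode and send each frame here
    }
    return (frames, fifo.availableBytes)
}
```

Calling `simulate(chunkCount: 6, chunkBytes: 341)` yields three 640-byte frames with 126 bytes left queued: nothing is emitted until 1280 bytes have accumulated, and the remainder carries over to the next pass. In a real app the FIFO would remain TPCircularBuffer, with `TPCircularBufferProduceBytes` called in the audio callback and the tail/consume calls made only from the polling side.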