Tags: ios, opengl-es, core-graphics, core-audio, avaudiorecorder

iOS record audio and draw waveform like Voice Memos


I'm going to ask this at the risk of being too vague or asking too many things in one question, but I'm really just looking for a point in the right direction.

In my app I want to record audio, show a waveform while recording, and scroll through the waveform to record and playback from a specified time. For example, if I have 3 minutes of audio, I should be able to scroll back to 2:00 and start recording from there to fix a mistake.

In Voice Memos, this is accomplished instantaneously, without any delay or loading time. I'm trying to figure out how they did this, if anyone has a clue.

What I've tried:

EZAudio - This library is great, but it doesn't do what I want. You can't scroll through the waveform, and once the waveform reaches a certain length, it discards the data at the beginning and keeps appending to the end.

SCWaveformView - This waveform is nice, but it uses images. Once the waveform gets long, putting it in a scroll view causes really jittery scrolling. Also, you can only build the waveform after recording, not while recording.

As for appending, I've used this method: https://stackoverflow.com/a/11520553/1391672 But in my experience there is significant processing time, even when appending two very short audio clips together.

How does Voice Memos do what it does? Do you think the waveform is drawn in OpenGL or Core Graphics? Are they using Core Audio or AVAudioRecorder? Has anyone built anything like this that can point me in the right direction?


Solution

  • When zoomed in, a scroll view only needs to draw the small portion of the waveform that is currently visible. When zoomed out, a graph view might draw only every Nth point of the audio buffer, or apply some other DSP down-sampling algorithm to the data before rendering (see the sketch after this answer). This likely has to be done using your own custom drawing or graphics rendering code inside a UIScrollView or similar custom controller. The waveform rendering code used during recording and afterward doesn't have to be the same.

    The recording API and the drawing API you use can be completely independent, and can be almost anything, from OpenGL ES to Metal to Core Graphics (on newer, faster devices). On the audio end, Core Audio will provide the lowest latency, but Audio Queues and AVAudioEngine might also be suitable (a capture sketch follows below).
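
One way to do the down-sampling: reduce the raw samples to per-bucket min/max pairs and draw one vertical line per bucket, rendering only the buckets that fall inside the scroll view's visible rect. A minimal sketch in Swift, assuming samples arrive as Float arrays (WaveformBucket and the bucket size are illustrative names, not from any library):

    import Foundation

    /// One down-sampled bucket: the min and max sample in a fixed window.
    /// Drawing a vertical line from min to max per bucket reproduces the
    /// waveform silhouette at any zoom level.
    struct WaveformBucket {
        var min: Float
        var max: Float
    }

    /// Reduce raw samples to one bucket per `samplesPerBucket` window.
    /// A scroll view then renders only the buckets inside its visible
    /// rect, so drawing cost stays flat as the recording grows.
    func downsample(_ samples: [Float], samplesPerBucket: Int) -> [WaveformBucket] {
        guard samplesPerBucket > 0, !samples.isEmpty else { return [] }
        var buckets: [WaveformBucket] = []
        buckets.reserveCapacity(samples.count / samplesPerBucket + 1)
        var start = 0
        while start < samples.count {
            let end = Swift.min(start + samplesPerBucket, samples.count)
            var lo = samples[start], hi = samples[start]
            for i in start..<end {
                lo = Swift.min(lo, samples[i])
                hi = Swift.max(hi, samples[i])
            }
            buckets.append(WaveformBucket(min: lo, max: hi))
            start = end
        }
        return buckets
    }

Appending new buckets as buffers arrive keeps the whole history in memory, which is what makes scrolling back to 2:00 cheap: the view re-renders from stored buckets instead of reprocessing the audio file.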
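
For feeding live samples into that model while recording, one option is a tap on AVAudioEngine's input node. A hedged sketch, assuming microphone permission is already granted; onSamples is a hypothetical hook into your waveform model, not an AVFoundation API:

    import AVFoundation

    /// Live-capture sketch: a tap on the input node delivers PCM buffers
    /// while recording runs, which can feed the down-sampler above.
    final class WaveformRecorder {
        private let engine = AVAudioEngine()

        // `onSamples` is a hypothetical callback for the waveform model.
        func start(onSamples: @escaping ([Float]) -> Void) throws {
            let input = engine.inputNode
            let format = input.outputFormat(forBus: 0)

            // Buffers of ~1024 frames arrive on a background thread.
            input.installTap(onBus: 0, bufferSize: 1024, format: format) { buffer, _ in
                guard let channel = buffer.floatChannelData?[0] else { return }
                let samples = Array(UnsafeBufferPointer(start: channel,
                                                        count: Int(buffer.frameLength)))
                DispatchQueue.main.async { onSamples(samples) }
            }
            try engine.start()
        }

        func stop() {
            engine.inputNode.removeTap(onBus: 0)
            engine.stop()
        }
    }

The same tap can also write its buffers to an AVAudioFile, so metering the waveform and recording the file don't need separate capture paths.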