Tags: audio, signal-processing, web-audio-api

What is an audio block?


I've seen some sentences that include the word 'block':

Pd lightens its workload by working with samples in blocks rather than individually. This greatly improves performance. The standard block size is 64 samples, but this setting can be changed. (http://pd-tutorial.com/english/ch03.html)

Rendering an audio graph is done in blocks of 128 sample-frames. A block of 128 sample-frames is called a render quantum, and the render quantum size is 128. (https://www.w3.org/TR/webaudio/#rendering-loop)

So, here is what I'm wondering:

(1) What is wrong with handling samples individually? Why are audio samples grouped into blocks of some size (64, 128)?

(2) Why is the block size a power of 2? // 2^6 = 64, 2^7 = 128

(3) After being grouped, where do the samples go? Are they then played by a sound card or something?


Solution

    1. Imagine buying french fries one at a time: go to the cashier, purchase a single fry, eat it, and repeat until you're full. That's the problem with processing samples individually -- there's a significant fixed cost to handling each batch of samples, and that cost doesn't depend on how big the batch is. Bigger batches => fewer batches => less total overhead. (The trade-off is latency: a whole block has to accumulate before it can be processed. See the first sketch after this list.)

    2. In computer systems, most block sizes are powers of 2, no matter what they are blocks of or what they're for. Historically that has been because math on powers of 2 can be done with cheap bit shifts and masks instead of divisions (see the ring-buffer sketch below). For audio especially, though, there are transforms like the FFT that are commonly used in processing and are only conveniently implemented on blocks with power-of-2 sizes.

    3. Lots of stuff can happen after you process a block. Eventually, if the sound is being played, each block will be sent to the audio device driver, which will arrange for it to be streamed out to the speakers or headphones one sample at a time (see the AudioWorkletProcessor sketch below).
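
Here is a minimal sketch of point 1 (the functions and the gain parameter are hypothetical, not any real host API); the point is that the fixed per-call cost is paid once per block instead of once per sample:

    // Per-sample: at a 48 kHz sample rate the host would make 48,000 of
    // these calls per second, paying call/bookkeeping overhead every time.
    function processSample(sample: number, gain: number): number {
      return sample * gain;
    }

    // Per-block: one call covers 128 samples, so the fixed per-call cost
    // is paid 375 times per second instead of 48,000.
    function processBlock(block: Float32Array, gain: number): void {
      for (let i = 0; i < block.length; i++) {
        block[i] *= gain;
      }
    }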
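
For point 2, here is one concrete place the power-of-2 choice pays off: a ring buffer (a common structure for moving audio between threads) whose wrap-around becomes a single bitwise AND. This is just an illustrative sketch:

    const SIZE = 128;        // 2^7; the mask trick below requires a power of 2
    const MASK = SIZE - 1;   // 127, i.e. binary 0111_1111
    const ring = new Float32Array(SIZE);
    let writeIndex = 0;

    function push(sample: number): void {
      ring[writeIndex] = sample;
      // Equivalent to (writeIndex + 1) % SIZE, but a cheap AND instead of
      // a division -- only valid because SIZE is a power of 2.
      writeIndex = (writeIndex + 1) & MASK;
    }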
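
And for point 3, since the question cites the Web Audio API: an AudioWorkletProcessor's process() method is invoked once per render quantum, with each channel handed over as a 128-sample Float32Array, and the browser then pushes the finished blocks on toward the audio device. A minimal white-noise generator (TypeScript; in a real project the worklet-scope types would come from a declaration file such as @types/audioworklet):

    // noise-processor.ts -- runs inside an AudioWorkletGlobalScope, loaded
    // with audioContext.audioWorklet.addModule(...) from the main thread.
    class NoiseProcessor extends AudioWorkletProcessor {
      process(
        inputs: Float32Array[][],
        outputs: Float32Array[][],
        parameters: Record<string, Float32Array>
      ): boolean {
        const output = outputs[0]; // first output bus
        for (const channel of output) {
          // channel.length is the render quantum size: 128 per the spec
          for (let i = 0; i < channel.length; i++) {
            channel[i] = Math.random() * 2 - 1;
          }
        }
        return true; // keep this processor alive
      }
    }
    registerProcessor('noise-processor', NoiseProcessor);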