Working with 24-bit audio samples


What is the "standard way" of working with 24-bit audio? Well, there are no native 24-bit data types available, really. Here are the methods that come to mind:

  1. Represent 24-bit audio samples as 32-bit ints and ignore the upper eight bits.
  2. Just like (1) but ignore the lower eight bits.
  3. Represent 24-bit audio samples as 32-bit floats.
  4. Represent the samples as structs of 3 bytes (acceptable for C/C++, but bad for Java).

How do you work this out?


Solution

  • Store them as 32- or 64-bit signed ints, or as float or double, unless you are space conscious and care about packing them into the smallest space possible.

    Audio samples often appear as 24 bits to and from audio hardware since this is commonly the resolution of the DACs and ADCs - although on most computer hardware, don't be surprised to find the bottom 3 or 4 bits banging away randomly with noise.

    Digital signal processing operations - which is what usually happens downstream from the acquisition of samples - mostly involve computing weighted sums of samples. A sample stored in an integer type can be considered to be fixed-point binary with an implied binary point at some arbitrary position, which you can choose strategically to maintain as many bits of precision as possible.
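As a minimal sketch of the fixed-point idea, the following computes a weighted sum with integer coefficients that carry an implied binary point. The choice of 15 fractional bits and the names `to_q15` and `weighted_sum` are assumptions made for illustration, not something the answer prescribes.

```python
# Hypothetical choice: coefficients carry 15 fractional bits (often called Q15).
COEFF_FRAC_BITS = 15


def to_q15(coefficient):
    """Convert a real-valued weight to an integer with 15 fractional bits."""
    return int(round(coefficient * (1 << COEFF_FRAC_BITS)))


def weighted_sum(samples, coeffs_q15):
    """Weighted sum of integer samples using fixed-point coefficients."""
    acc = 0
    for sample, coeff in zip(samples, coeffs_q15):
        # Each product has 15 fractional bits below the implied binary point.
        acc += sample * coeff
    # Round to nearest (ties toward +infinity), then drop the fractional bits.
    return (acc + (1 << (COEFF_FRAC_BITS - 1))) >> COEFF_FRAC_BITS


# Average of two samples with weights 0.5 and 0.5.
average = weighted_sum([1000, 3000], [to_q15(0.5), to_q15(0.5)])
```

Python integers never overflow, so the accumulator width is not a concern here; in a language with fixed-width integers, a 24-bit sample times a 16-bit coefficient already needs up to 40 bits, which is one reason to accumulate in a 64-bit type.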

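    For instance, the sum of two 24-bit integers yields a result of up to 25 bits, and each further doubling of the number of summed samples adds another bit. A 32-bit type has only 8 bits of headroom over 24-bit samples, so once you sum more than 256 full-scale samples it can overflow, and you need to re-normalize by rounding and shifting right.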
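The headroom argument can be checked numerically. This sketch reads the 8 spare bits of a 32-bit type as room for 2^8 = 256 worst-case 24-bit samples; the helper names `worst_case_sum_fits` and `renormalize` are hypothetical.

```python
INT24_MIN = -(1 << 23)
INT24_MAX = (1 << 23) - 1
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def fits_int32(value):
    """True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def worst_case_sum_fits(count):
    """True if summing `count` full-scale 24-bit samples cannot overflow int32."""
    return fits_int32(count * INT24_MIN) and fits_int32(count * INT24_MAX)


def renormalize(acc, shift):
    """Round to nearest (ties toward +infinity) and shift right by `shift` >= 1 bits."""
    return (acc + (1 << (shift - 1))) >> shift


# 256 full-scale samples just fit; one more can overflow.
fits_256 = worst_case_sum_fits(256)
fits_257 = worst_case_sum_fits(257)

# Shifting the worst-case 256-sample sum right by 8 bits returns to 24-bit range.
back_in_range = renormalize(256 * INT24_MIN, 8)
```

Python's `>>` on negative integers is an arithmetic (sign-preserving) shift, which is what re-normalization needs. In C, right-shifting a negative signed value is implementation-defined, so it is worth checking the target compiler's behaviour.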

    Therefore, if you're using integer types to store your samples, use the largest you can and start with the samples in the least significant 24 bits.
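A minimal sketch of loading packed 24-bit samples into ordinary integers so that each value sits, sign-extended, in the least significant 24 bits. It assumes little-endian, 3-bytes-per-sample input (as found in 24-bit PCM WAV data); `unpack_s24le` and `pack_s24le` are hypothetical helper names.

```python
def unpack_s24le(data):
    """Decode packed little-endian signed 24-bit samples into Python ints."""
    if len(data) % 3 != 0:
        raise ValueError("input length must be a multiple of 3 bytes")
    return [
        # signed=True performs the sign extension from bit 23.
        int.from_bytes(data[i:i + 3], "little", signed=True)
        for i in range(0, len(data), 3)
    ]


def pack_s24le(samples):
    """Encode ints in the range -2**23 .. 2**23 - 1 as packed little-endian 24-bit."""
    # to_bytes raises OverflowError if a sample does not fit in 3 bytes.
    return b"".join(s.to_bytes(3, "little", signed=True) for s in samples)


raw = bytes([0xFF, 0xFF, 0xFF,   # -1
             0x00, 0x00, 0x80,   # most negative value
             0xFF, 0xFF, 0x7F])  # most positive value
samples = unpack_s24le(raw)
```

In a fixed-width language the same result is usually obtained by assembling the three bytes into the top of a 32-bit word and shifting right arithmetically by 8, or by OR-ing in the upper byte when bit 23 is set.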

    Floating point types of course take care of this detail for you, although you get less choice about when renormalisation takes place. They are the usual choice for audio processing where hardware support is available. A single precision float has a 24-bit mantissa (23 stored bits plus an implicit leading one), so it can hold a 24-bit sample without loss of precision.

    Usually floating point samples are stored in the range -1.0f < x < 1.0f.
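The float representation can be sketched as follows, assuming the common convention of dividing by 2^23 so that the most negative sample maps to -1.0. Python floats are doubles, so the hypothetical helper `to_float32` round-trips a value through a 32-bit float with `struct` to show that no precision is lost.

```python
import struct

SCALE = float(1 << 23)  # assumed scale: full-scale negative maps to -1.0


def to_float32(value):
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def int24_to_float(sample):
    """Map a signed 24-bit sample to a single-precision value in [-1.0, 1.0)."""
    return to_float32(sample / SCALE)


def float_to_int24(value):
    """Map a float back to a signed 24-bit sample, clamping out-of-range input."""
    scaled = int(round(value * SCALE))
    return max(-(1 << 23), min((1 << 23) - 1, scaled))


# Every 24-bit sample survives the trip through float32 unchanged;
# a few representative values are checked here.
checked = [-8388608, -1, 0, 1, 1234567, 8388607]
round_trip_ok = all(float_to_int24(int24_to_float(s)) == s for s in checked)
```

With this scaling the positive extreme is slightly below 1.0, and a float of exactly 1.0 has to be clamped on the way back to integers. Some code divides by 2^23 - 1 instead to make the range symmetric; which convention applies depends on the library in use.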