Tags: javascript, web-audio-api, mediaelement, audiobuffer

Why is it more suitable to use a MediaElementAudioSourceNode for longer sounds?


Complete question: Why is it more suitable to use a MediaElementAudioSourceNode rather than an AudioBuffer for longer sounds?

From MDN:

Objects of these types are designed to hold small audio snippets, typically less than 45 s. For longer sounds, objects implementing the MediaElementAudioSourceNode are more suitable.

From the specification:

This interface represents a memory-resident audio asset (for one-shot sounds and other short audio clips). Its format is non-interleaved 32-bit linear floating-point PCM values with a normal range of [−1,1], but values are not limited to this range. It can contain one or more channels. Typically, it would be expected that the length of the PCM data would be fairly short (usually somewhat less than a minute). For longer sounds, such as music soundtracks, streaming should be used with the audio element and MediaElementAudioSourceNode.

  1. What are the benefits of using a MediaElementAudioSourceNode over an AudioBuffer?
  2. Are there any disadvantages when using a MediaElementAudioSourceNode for short clips?

Solution

    1. MediaElementAudioSourceNode can potentially stream, and can certainly start playing before the entire sound file has been downloaded and decoded. It also does this without converting (and likely expanding!) the sound file to 32-bit linear PCM (CD-quality audio is only 16 bits per sample) and resampling it to the output device's sample rate. For example, a 1-minute mono podcast recorded at 16-bit, 16 kHz would be just under 2 megabytes natively (16,000 samples/s × 2 bytes × 60 s = 1,920,000 bytes); if you're playing back on a 48 kHz device (not uncommon), converting to 32-bit, 48 kHz would mean you're using nearly 12 megabytes (48,000 × 4 × 60 = 11,520,000 bytes) as an AudioBuffer.

    2. MediaElementAudioSourceNode won't give you precise playback timing, or the ability to manage and play back lots of simultaneous sounds. The precision may be reasonable for your use case, but it won't be the sample-accurate timing that an AudioBuffer (played through an AudioBufferSourceNode) can give you.
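
To make both points a bit more concrete, here is a rough sketch rather than a definitive implementation. `pcmBytes` reproduces the memory arithmetic from point 1, assuming a mono recording (which the figures in the answer imply). The other two helpers contrast the two source types from point 2. All three function names are hypothetical, `audioCtx` is assumed to be an existing `AudioContext`, `audioElement` an `<audio>` element, and `audioBuffer` the result of an earlier `decodeAudioData()` call.

```javascript
// Size in bytes of uncompressed PCM audio.
// Assumption: every channel uses the same sample rate and sample width.
function pcmBytes(seconds, sampleRate, bytesPerSample, channels) {
  return seconds * sampleRate * bytesPerSample * channels;
}

console.log(pcmBytes(60, 16000, 2, 1)); // 1920000  (16-bit, 16 kHz, mono)
console.log(pcmBytes(60, 48000, 4, 1)); // 11520000 (32-bit float, 48 kHz, mono)

// Long track: the <audio> element downloads and decodes incrementally,
// so the whole file never has to sit in memory as 32-bit PCM.
function connectStreamingSource(audioCtx, audioElement) {
  const source = audioCtx.createMediaElementSource(audioElement);
  source.connect(audioCtx.destination);
  // Playback is still driven by audioElement.play(), not by the node.
  return source;
}

// Short clip: a fully decoded AudioBuffer can be scheduled against the
// context clock, which is what allows sample-accurate start times.
function scheduleClip(audioCtx, audioBuffer, delaySeconds) {
  const source = audioCtx.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioCtx.destination);
  const when = audioCtx.currentTime + delaySeconds;
  source.start(when);
  return when; // the context time at which the clip begins
}
```

In a browser you would call `connectStreamingSource(audioCtx, audioElement)` once per element and then use `audioElement.play()`, whereas `scheduleClip` can be called many times with the same `audioBuffer` to layer simultaneous sounds, since each call creates a fresh one-shot source node.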