Record audio and microphone stream as different MediaRecorder tracks at the same time

I'm trying to record the user's response (via microphone) to some audio content, so that I can analyse the client-side timing of the user's response to this audio precisely. Ideally, I'd record two tracks along the same timeline: (1) the microphone stream from the user; and (2) the audio stream as heard by the user.

I am not experienced with the Web Audio API, but using some previous SO answers I arrived at the solution below: I connect the audio source (source) and microphone source (stream) in a single stream (combinedStream), which is fed to a MediaRecorder.

My questions:

  1. This records a single track (i.e. the audio and microphone signals must be separated in post-processing). Is it possible to record them into two tracks, e.g. crudely as the two channels of a stereo signal?

  2. It is not clear to me whether this is the best approach with respect to latency: maybe there is an overhead associated with connecting the streams, or an uncaptured latency associated with the actual audio playback on the client? Any advice would be appreciated. Currently there is a ~10-20 ms latency between audio source and playback (measured crudely by looking at the delay between the audio stream and its playback through the speakers, as picked up on the microphone stream).

  3. I don't know much about HTML5 Audio, but maybe there is a better solution using it?



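      // Audio for playback
      var source = context.createBufferSource();
      source.buffer = ...

      // Merge audio source with microphone stream
      // (context, audioContext and jsPsych.pluginAPI.audioContext() must all
      // refer to the same AudioContext, otherwise the nodes cannot be connected)
      const mediaStreamDestination = audioContext.createMediaStreamDestination();
      const sourceMic = jsPsych.pluginAPI.audioContext().createMediaStreamSource(stream);
      source.connect(mediaStreamDestination);
      sourceMic.connect(mediaStreamDestination);
      let combinedStream = mediaStreamDestination.stream;

      // Media recorder
      mediaRecorder = new MediaRecorder(combinedStream);
      mediaRecorder.ondataavailable = function(event) {
        // event.data holds the recorded chunk
      };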



  • To record two different audio sources on two different channels (e.g. right and left in a stereo file), you can use a ChannelMergerNode.

    Basically it's the same setup as yours, except that when connecting each source you pass the merger input it should feed as the third argument of the connect( destination, output, input ) method:

    Using two oscillators:

    onclick = ()=>{
      onclick = null;
      const ctx = new AudioContext();
      const osc1 = ctx.createOscillator();
      const osc2 = ctx.createOscillator();
      osc1.frequency.value = 300;
      osc1.start( 0 );
      osc2.start( 0 );

      const merger = ctx.createChannelMerger();
      const dest = ctx.createMediaStreamDestination();
      merger.connect( dest );

      // output 0 of each oscillator goes to a different merger input
      osc1.connect( merger, 0, 0 );
      osc2.connect( merger, 0, 1 );

      // for nodes to output sound in Chrome
      // they need to be connected to the destination
      // ...
      const mute = ctx.createGain();
      mute.gain.value = 0;
      mute.connect( ctx.destination );
      osc1.connect( mute );
      osc2.connect( mute );

      // record the merged stream for 5 seconds
      const chunks = [];
      const rec = new MediaRecorder( dest.stream );
      rec.ondataavailable = e => chunks.push( e.data );
      rec.onstop = e => {
        output.src = URL.createObjectURL( new Blob( chunks ) );
      };
      rec.start();
      setTimeout( () => rec.stop(), 5000 );
    };

    <p id="log">click to start recording of 5s sample</p>
    <audio id="output" controls></audio>
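
Applied to the setup in the question, the same idea might look like the sketch below. It is only an illustration under some assumptions: the function name `buildTwoChannelStream` and the parameter names `playbackSource` and `micStream` are hypothetical, `playbackSource` is expected to be the buffer source from the question, and `micStream` the microphone MediaStream obtained from getUserMedia. Each merger input is mono, so a stereo source would be down-mixed before it reaches its channel.

```javascript
// Sketch only: route playback and microphone to separate channels of one stream.
// ctx: an AudioContext; playbackSource: an AudioNode created from ctx;
// micStream: a MediaStream from getUserMedia. All names are hypothetical.
function buildTwoChannelStream( ctx, playbackSource, micStream ) {
  const micSource = ctx.createMediaStreamSource( micStream );
  const merger = ctx.createChannelMerger( 2 );
  const dest = ctx.createMediaStreamDestination();
  merger.connect( dest );

  // connect( destination, output, input ): the third argument picks the merger input
  playbackSource.connect( merger, 0, 0 ); // playback   -> channel 0 (left)
  micSource.connect( merger, 0, 1 );      // microphone -> channel 1 (right)

  // the user still has to hear the playback
  playbackSource.connect( ctx.destination );

  // same Chrome workaround as in the oscillator example: keep the microphone
  // attached to the destination through a muted gain, so it is never audible
  const mute = ctx.createGain();
  mute.gain.value = 0;
  mute.connect( ctx.destination );
  micSource.connect( mute );

  return dest.stream; // to be passed to new MediaRecorder( ... )
}
```

The returned stream would then be recorded as in the snippet above, e.g. `new MediaRecorder( buildTwoChannelStream( ctx, source, stream ) )`. Whether the encoded file really keeps the two channels separate can depend on the browser and codec, so it is worth checking a test recording first.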

