Given a plain web video of, say, 30 seconds:
<video src="my-video.mp4"></video>
How could I generate its volume level chart?
volume|
 level|    ******
      |   *      *
      |  *        *
      |**          *       *
      | *           *      * *
      +---------------*-*-----************------+--- time
      0                                         30s
         video is             and quiet
         loud here            here
Note:
There are several ways to do this, depending on the intended usage.
For accuracy you could measure in conventional loudness values and units such as RMS, LUFS/LKFS (K-weighted loudness), dBFS (decibels relative to full scale) and so forth.
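As a point of reference, here is a minimal sketch of how RMS over a window of float PCM samples (range -1..1) maps to dBFS; the function names are placeholders, and the full example further down does the same calculation per window:

// Minimal sketch: RMS of a window of Float32 PCM samples, converted to dBFS.
function rmsOfWindow(samples) {
  var sum = 0;
  for (var i = 0; i < samples.length; i++) sum += samples[i] * samples[i];
  return Math.sqrt(sum / samples.length);
}
function toDBFS(rms) {
  return 20 * Math.log10(rms); // 0 dBFS = full scale; quieter signals are negative
}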
The simple, naive approach is to just plot the peaks of the waveform; you would be interested in the positive values only. To get just the peaks, you detect the direction between two points and log the first point when the direction changes from upward to downward (p0 > p1).
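A rough sketch of that idea (the function and its Float32Array input are illustrative, not part of the example further down):

// Naive positive-peak picker: log a sample when the direction turns
// from upward to downward.
function findPeaks(samples) {
  var peaks = [];
  for (var i = 1; i < samples.length - 1; i++) {
    var p0 = samples[i], p1 = samples[i + 1];
    if (p0 > 0 && samples[i - 1] <= p0 && p0 > p1) { // rising, then falling
      peaks.push({ index: i, value: p0 });
    }
  }
  return peaks;
}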
For all approaches you can finally apply some form of smoothing, such as a weighted moving average (example) or a generic smoothing algorithm, to remove small peaks and changes. In the case of RMS, dB etc. you would use a window size, which can be combined with bin-smoothing (an average per segment).
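For instance, a basic linearly-weighted moving average might look like this (just one possible weighting scheme; the full example below uses bin-smoothing instead):

// Simple weighted moving average: more recent values weigh more.
// "values" could be the peak or per-window dB readings collected earlier.
function weightedMovingAverage(values, windowSize) {
  var out = [];
  for (var i = 0; i < values.length; i++) {
    var sum = 0, weightSum = 0;
    for (var j = 0; j < windowSize && i - j >= 0; j++) {
      var w = windowSize - j;   // linear weights, current sample heaviest
      sum += values[i - j] * w;
      weightSum += w;
    }
    out.push(sum / weightSum);
  }
  return out;
}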
To plot, you obtain the value for the current sample, assume it is normalized, and draw it as a line or point on a canvas, scaled by the plot area height.
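In its simplest form that plotting step could be something like the following (the canvas and the array of normalized 0..1 values are assumed to exist; the full example below does the same with per-window dB values):

// Plot normalized values (0..1) to a canvas, scaled by the plot area height.
function plotValues(canvas, values) {
  var pctx = canvas.getContext("2d");
  pctx.beginPath();
  pctx.moveTo(0, canvas.height);
  for (var x = 0; x < values.length && x < canvas.width; x++) {
    pctx.lineTo(x, canvas.height - values[x] * canvas.height); // 0 = bottom of plot
  }
  pctx.stroke();
}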
To address some of the questions in the comments (these are just off the top of my head, to give some pointers):

Since the Web Audio API cannot do streaming on its own, you have to load the entire file into memory and decode the audio track into a buffer.*

*: There is always the option of storing the downloaded video as a blob in IndexedDB (with its implications) and using an object URL with that blob to stream it into the video element (this may require MSE to work properly; I haven't tried it myself). See the rough sketch after this list.

Other approaches:
- Plotting while streaming
- Side-loading a low-quality mono audio-only file
- Server-side plotting

I might have left out or missed some points, but it should give the general idea...
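For the blob/object-URL idea in the footnote above, an untested, minimal sketch might look like this (the URL and element selection are placeholders; persisting the blob to IndexedDB and any MSE handling are left out):

// Sketch only: download the media once, then feed the <video> element from a
// local blob via an object URL.
fetch("my-video.mp4", { mode: "cors" })
  .then(function(resp) { return resp.blob(); })
  .then(function(blob) {
    // The blob could also be stored in IndexedDB here for later sessions.
    var video = document.querySelector("video");
    video.src = URL.createObjectURL(blob);
    // The audio data for plotting can be decoded from the same blob
    // (blob.arrayBuffer() -> decodeAudioData) so the file is only fetched once.
  });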
This example measures conventional dB over a given window size, per sample. The bigger the window size, the smoother the result, but the longer it takes to calculate.
Note that for simplicity, pixel position determines the dB window range in this example. Depending on the buffer size this may produce uneven gaps/overlaps that affect the current sample value, but it should work for the purpose demonstrated here. Also for simplicity, I am scaling the dB reading by dividing it by 40, a somewhat arbitrary number here (the Math.abs() is just for the plotting, and for how my brain worked in the late night/early morning when I made this :) ).
I added bin/segment-smoothing in red on top to better show longer-term audio variations relevant to things such as auto-leveling.
I'm using an audio source here, but you can plug in a video source instead, as long as it contains an audio track in a format that can be decoded (AAC, MP3, Ogg etc.).
Apart from that, the example is just that: an example. It's not production code, so take it for what it's worth and make adjustments as needed.
(For some reason the audio won't play in Firefox 58 beta, though it will plot. Audio plays in Chrome and Firefox 58 dev.)
var ctx = c.getContext("2d"), ref, audio;
var actx = new (window.AudioContext || window.webkitAudioContext)();
var url = "//dl.dropboxusercontent.com/s/a6s1qq4lnwj46uj/testaudiobyk3n_lo.mp3";
ctx.font = "20px sans-serif";
ctx.fillText("Loading and processing...", 10, 50);
ctx.fillStyle = "#001730";
// Load audio
fetch(url, {mode: "cors"})
.then(function(resp) {return resp.arrayBuffer()})
.then(actx.decodeAudioData.bind(actx))
.then(function(buffer) {
  // Get data from channel 0 (you will want to measure all channels / average them)
  var channel = buffer.getChannelData(0);
  // dB per window + plot
  var points = [0];
  ctx.clearRect(0, 0, c.width, c.height);
  ctx.moveTo(0, c.height);                    // start the fill path at the bottom-left
  for (var x = 1, i, v; x < c.width; x++) {
    i = ((x / c.width) * channel.length) | 0; // get index in buffer based on x
    v = Math.abs(dB(channel, i, 8820)) / 40;  // 200 ms window (at 44.1 kHz), normalize
    ctx.lineTo(x, c.height * v);
    points.push(v);
  }
  ctx.fill();
  // smooth using bins
  var bins = 40;                    // number of segments
  var range = (c.width / bins) | 0; // points per segment
  var sum;
  ctx.beginPath();
  ctx.moveTo(0, c.height);
  for (x = 0; x < points.length;) {               // x is advanced by the inner loop
    for (v = 0, i = 0; i < range && x < points.length; i++) {
      v += points[x++];
    }
    sum = v / i;                                  // average of the points actually read
    ctx.lineTo(x - (range >> 1), sum * c.height); // -range/2 to compensate visually
  }
  ctx.lineWidth = 2;
  ctx.strokeStyle = "#c00";
  ctx.stroke();
  // for audio / progress bar only
  c.style.backgroundImage = "url(" + c.toDataURL() + ")";
  c.width = c.width;                // resets the canvas, keeping the plot as background
  ctx.fillStyle = "#c00";
  audio = document.querySelector("audio");
  audio.onplay = start;
  audio.onended = stop;
  audio.style.display = "block";
});
// calculates RMS over a winSize-sample window ending at pos and returns dB
function dB(buffer, pos, winSize) {
  for (var rms, sum = 0, v, i = pos - winSize + 1; i <= pos; i++) {
    v = i < 0 ? 0 : buffer[i];
    sum += v * v;
  }
  rms = Math.sqrt(sum / winSize); // root of the mean of the squares
  return 20 * Math.log10(rms);
}
// for progress bar (audio)
function start() { if (!ref) ref = requestAnimationFrame(progress); }
function stop() { cancelAnimationFrame(ref); ref = null; }
function progress() {
  var x = audio.currentTime / audio.duration * c.width;
  ctx.clearRect(0, 0, c.width, c.height);
  ctx.fillRect(x - 1, 0, 2, c.height);
  ref = requestAnimationFrame(progress);
}
body {background:#536375}
#c {border:1px solid;background:#7b8ca0}
<canvas id=c width=640 height=300></canvas><br>
<audio style="display:none" src="//dl.dropboxusercontent.com/s/a6s1qq4lnwj46uj/testaudiobyk3n_lo.mp3" controls></audio>