I am confused by how pydub computes rms.
In [187]: audio = AudioSegment.from_mp3("sample-mp3")
In [188]: audio.rms
Out[188]: 1041
In [189]: audio.dBFS
Out[189]: -29.959984108983633
However, using sox:
$ sox sample.mp3 -n stat
Samples read: 130231296
Length (seconds): 1476.545306
Scaled by: 2147483647.0
Maximum amplitude: 1.000000
Minimum amplitude: -1.000000
Midline amplitude: -0.000000
Mean norm: 0.017384
Mean amplitude: -0.000023
**RMS amplitude: 0.031763**
Maximum delta: 1.308396
Minimum delta: 0.000000
Mean delta: 0.015841
RMS delta: 0.028429
Rough frequency: 6282
Volume adjustment: 1.000
Can anyone please enlighten me on how these RMS values are computed? Thanks.
They represent the same value, just on different scales. pydub appears to work with signed 16-bit values (maybe because of the 16-bit depth of the mp3 file?), while SoX by default scales its internal 32-bit signed values to [-1, 1]. You can bring the two outputs into agreement by multiplying the SoX value by 2^15 (0.031763 × 32768 ≈ 1041), or by telling SoX to use a signed 16-bit scale with the -s argument of the stat effect. As 2^31/2^15 is 2^16, that should be -s 65536.
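As a quick sanity check of the scaling, here is a minimal Python sketch using only the numbers quoted in the question. The helper names (`sox_rms_to_16bit`, `rms_to_dbfs`) and the constant `FULL_SCALE_16BIT` are made up for illustration and are not part of pydub or SoX. It also assumes that dBFS is 20·log10(rms / 32768), which is an assumption about the formula, although it does reproduce the dBFS figure shown above.

```python
import math

# Assumption: samples are signed 16-bit, so full scale is 2**15 = 32768.
FULL_SCALE_16BIT = 2 ** 15


def sox_rms_to_16bit(rms_normalized: float) -> float:
    """Convert an RMS value on SoX's [-1, 1] scale to signed 16-bit sample units."""
    return rms_normalized * FULL_SCALE_16BIT


def rms_to_dbfs(rms_16bit: float) -> float:
    """Express a 16-bit RMS value in decibels relative to full scale."""
    return 20 * math.log10(rms_16bit / FULL_SCALE_16BIT)


sox_rms = 0.031763  # "RMS amplitude" reported by sox stat in the question
pydub_rms = 1041    # audio.rms reported by pydub in the question

converted = sox_rms_to_16bit(sox_rms)
print(round(converted))        # 1041, the same as audio.rms
print(rms_to_dbfs(pydub_rms))  # roughly -29.96, in line with audio.dBFS
```

The small difference between 1040.8 and 1041 is consistent with pydub reporting rms as an integer.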