I need to manipulate some .wav files, and I am using the scipy.io.wavfile module to help me with this task. I ran into a problem when I tried to understand how the read and write functions work.
I have a sample file, input_file.wav. The code I wrote that worked as expected was:
from scipy.io import wavfile

def scale(filename):
    # Read the sample rate and samples, then write them back unchanged.
    fs, x = wavfile.read(filename)
    wavfile.write('test_output.wav', fs, x)
    return

scale('input_file.wav')
The input and output files looked identical when I imported them into Audacity, and sounded identical on my headphones. I ran into issues when I executed the following code.
def scale(filename):
    fs, x = wavfile.read(filename)
    x1 = x * 0.5
    wavfile.write('test_output1.wav', fs, x1)
    return

scale('input_file.wav')
I expected that the output would be half as loud (since I multiplied the value of each sample by 0.5). But when I imported it into Audacity, the file was loud to the point of severe distortion.
The same thing happened when I multiplied by 1.01, 1.0001, 0.1, and a number of other values I tried: massively boosted volume to the point of heavy distortion.
The file started to sound identical to the original (and look identical when imported into Audacity) when I multiplied the sample array by a value of 1/32767 or so (which is 1/(2^15-1)). This is strange because the values in the sample array returned by the read() function are definitely not identical.
Why do the output files from the write operation sound the same when the scaling value is either 1 or 1/32767, two very different numbers?
Any help would be appreciated, thank you.
EDIT: If it helps, the output of x.dtype (the dtype attribute of the sample array returned by read()) is int16.
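If x has dtype np.int16, then x1 has dtype np.float64. It appears that scipy.io.wavfile.write attempts to write 64-bit floating point values to the file, even though the documentation only mentions 32-bit floating point formats. Floating point WAV data is conventionally interpreted with full scale at ±1.0, which is why samples left in the int16 range sound heavily clipped, and why scaling by about 1/32767 restores the original level. You can work around the problem by converting x1 to int16, or by normalizing the values in x1 to the range [-1, 1] (or [-0.5, 0.5], or to whatever range you want in [-1, 1]). That is, you can use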
wavfile.write('test_output1.wav', fs, np.round(x1).astype(x.dtype)) # If x has an integer dtype
or
wavfile.write('test_output1.wav', fs, (x1/2**15).astype(np.float32))
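To see both workarounds end to end, here is a minimal sketch. It assumes a synthetic 440 Hz tone in place of input_file.wav, and the output file names and temporary directory are made up for illustration. It checks the dtype promotion described above, writes the half-volume signal both ways, and reads the files back.

```python
import os
import tempfile

import numpy as np
from scipy.io import wavfile

# Synthetic stand-in for input_file.wav (an assumption, not the asker's file):
# one second of a 440 Hz tone stored as int16 at half of full scale.
fs = 8000
t = np.arange(fs) / fs
x = np.round(0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)

# Multiplying an int16 array by a Python float promotes the result to float64.
x1 = x * 0.5
promoted_dtype = x1.dtype

# Hypothetical output locations for this sketch.
out_dir = tempfile.mkdtemp()
int_path = os.path.join(out_dir, 'half_int16.wav')
float_path = os.path.join(out_dir, 'half_float32.wav')

# Option 1: round and cast back to the original integer dtype.
wavfile.write(int_path, fs, np.round(x1).astype(x.dtype))

# Option 2: rescale into [-1, 1] and store as 32-bit float.
wavfile.write(float_path, fs, (x1 / 2**15).astype(np.float32))

# Read both files back to inspect what was actually stored.
fs_int, y_int = wavfile.read(int_path)
fs_float, y_float = wavfile.read(float_path)

print(promoted_dtype, y_int.dtype, y_float.dtype)
```

Option 1 keeps the file in the same format as the input, at the cost of rounding each sample to an integer. Option 2 avoids the rounding but changes the file to a floating point WAV, which some older tools may not read.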