I am trying to record audio on a Raspberry Pi, but have run into an issue. When I use paFloat32 in PyAudio and np.frombuffer(data, np.float32), I get good audio. But if I use paInt16 and np.int16, I get garbage static.
Minimal code is here:
import pyaudio
import numpy as np
import scipy.io.wavfile as wf

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32,
                channels=2,
                rate=16000,
                input=True,
                input_device_index=0,
                frames_per_buffer=1024)
frames = [[], []]  # Initialize array to store frames
print("recording...")
for i in range(0, 128):
    data = stream.read(1024)
    decoded = np.frombuffer(data, np.float32)
    decodedSplit = np.stack((decoded[::2], decoded[1::2]), axis=0)  # channels on separate axes
    frames = np.append(frames, decodedSplit, axis=1)
stream.close()
p.terminate()
wf.write('test.wav', 16000, frames.T)
From what I understand, changing paFloat32 to paInt16 and np.float32 to np.int16 should be all I need to do, but it does not work. Could there be an issue with the sound card, or is something misconfigured? I have stared at this for two days now and am stuck. Again, float works perfectly, but the next chunk of code relies on someone else's library written for int16.
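One thing that may be worth checking, and this is a guess rather than something confirmed in the post: frames starts as [[], []], which NumPy treats as an empty float64 array, so np.append promotes int16 samples to float64. scipy.io.wavfile.write chooses the WAV sample format from the array dtype, so a float64 array holding values in the int16 range would be written as floating-point audio far outside the usual [-1.0, 1.0] range. The sketch below uses a made-up chunk of bytes in place of stream.read() (an assumption, no audio hardware involved) to show the promotion, and one way to keep int16 by collecting chunks in a list.

```python
import numpy as np

# Synthetic stand-in for one stream.read() result (assumption):
# 4 interleaved stereo frames of int16 samples.
chunk = np.array([100, -100, 200, -200, 300, -300, 400, -400],
                 dtype=np.int16).tobytes()

decoded = np.frombuffer(chunk, dtype=np.int16)
split = np.stack((decoded[::2], decoded[1::2]), axis=0)

# Pattern from the question: append to an initially empty nested list.
frames = [[], []]
frames = np.append(frames, split, axis=1)
promoted_dtype = frames.dtype  # the empty list is float64, so the result is too

# Alternative: gather raw chunks in a Python list and concatenate once.
chunks = [decoded]
samples = np.concatenate(chunks).reshape(-1, 2)  # one row per frame, one column per channel
kept_dtype = samples.dtype  # stays int16

print(promoted_dtype, kept_dtype)
```

If this is the cause, passing an int16 array (for example frames.T.astype(np.int16)) to wf.write should be enough with paInt16.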
Rather than try to force int16 capture to work, I found a float-to-PCM int16 converter in HudsonHuang's GitHub gist.
With it, I was able to rewrite the recording code as follows:
import pyaudio
import numpy as np
import scipy.io.wavfile as wf


# From https://gist.github.com/HudsonHuang/fbdf8e9af7993fe2a91620d3fb86a182
def float2pcm(sig, dtype='int16'):
    """Convert floating point signal with a range from -1 to 1 to PCM.

    Any signal values outside the interval [-1.0, 1.0) are clipped.
    No dithering is used.
    Note that there are different possibilities for scaling floating
    point numbers to PCM numbers, this function implements just one of
    them. For an overview of alternatives see
    http://blog.bjornroche.com/2009/12/int-float-int-its-jungle-out-there.html

    Parameters
    ----------
    sig : array_like
        Input array, must have floating point type.
    dtype : data type, optional
        Desired (integer) data type.

    Returns
    -------
    numpy.ndarray
        Integer data, scaled and clipped to the range of the given
        *dtype*.

    See Also
    --------
    pcm2float, dtype

    """
    sig = np.asarray(sig)
    if sig.dtype.kind != 'f':
        raise TypeError("'sig' must be a float array")
    dtype = np.dtype(dtype)
    if dtype.kind not in 'iu':
        raise TypeError("'dtype' must be an integer type")
    i = np.iinfo(dtype)
    abs_max = 2 ** (i.bits - 1)
    offset = i.min + abs_max
    return (sig * abs_max + offset).clip(i.min, i.max).astype(dtype)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32,
                channels=2,
                rate=16000,
                input=True,
                input_device_index=0,
                frames_per_buffer=1024)
linearFrames = []
frames = [[], []]  # Initialize array to store frames
print("recording...")
for i in range(0, 128):
    data = stream.read(1024)
    decoded = np.frombuffer(data, np.float32)
    decoded = float2pcm(decoded, 'int16')
    linearFrames = np.append(linearFrames, decoded)
    linearFrames = linearFrames.astype(np.int16)  # np.append promotes to float64 on the first pass
decodedSplit = np.stack((linearFrames[::2], linearFrames[1::2]), axis=0)  # channels on separate axes
stream.close()
p.terminate()
wf.write('test.wav', 16000, decodedSplit.T.astype(np.int16))
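To make the scaling concrete, here is a small worked example. It uses a trimmed copy of float2pcm (type checks left out for brevity) and a hand-written interleaved stereo buffer, which is an assumption standing in for real microphone data. For int16, abs_max is 32768 and offset is 0, so each sample is multiplied by 32768 and then clipped to [-32768, 32767].

```python
import numpy as np


def float2pcm(sig, dtype='int16'):
    """Trimmed copy of the converter above (type checks omitted)."""
    sig = np.asarray(sig)
    i = np.iinfo(np.dtype(dtype))
    abs_max = 2 ** (i.bits - 1)   # 32768 for int16
    offset = i.min + abs_max      # 0 for signed types
    return (sig * abs_max + offset).clip(i.min, i.max).astype(dtype)


# Hypothetical interleaved stereo buffer: L0, R0, L1, R1, L2, R2.
interleaved = np.array([-1.0, 1.0, 0.0, 0.5, 0.25, -0.5], dtype=np.float32)

pcm = float2pcm(interleaved)
print(pcm.tolist())     # [-32768, 32767, 0, 16384, 8192, -16384]

# reshape(-1, 2) gives one row per frame and one column per channel,
# the same layout as decodedSplit.T in the code above.
stereo = pcm.reshape(-1, 2)
print(stereo.tolist())  # [[-32768, 32767], [0, 16384], [8192, -16384]]
```

Note that +1.0 maps to 32768 before clipping and ends up as 32767, which is the "[-1.0, 1.0)" caveat in the docstring.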