python, numpy, pyaudio

pyaudio stream.read returns static when in int16, returns good audio in float32


I am trying to record audio on a Raspberry Pi, but have run into an issue. When I use paFloat32 in PyAudio and decode with np.frombuffer(data, np.float32), I get good audio. But if I use paInt16 and np.int16, I get garbage static.

Minimal code is here:

import pyaudio
import numpy as np
import scipy.io.wavfile as wf

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32,
            channels=2,
            rate=16000,
            input=True,
            input_device_index=0,
            frames_per_buffer=1024)
frames = [[],[]]  # Initialize array to store frames

print("recording...")

for i in range(0, 128):
    data = stream.read(1024)
    decoded = np.frombuffer(data, np.float32)
    decodedSplit = np.stack((decoded[::2], decoded[1::2]), axis=0)  # channels on separate axes
    frames = np.append(frames, decodedSplit, axis=1)



stream.close()
p.terminate()

wf.write('test.wav', 16000, frames.T)

From what I understand, changing paFloat32 to paInt16 and np.float32 to np.int16 is all I need to do, but it does not work. Is there a possible issue with the sound card, or is something misconfigured? I have stared at this for two days now and am stuck. Again, float works perfectly, but the next chunk of code relies on someone else's library written for int16.


Solution

  • Rather than trying to force paInt16 to work, I kept recording in float32 and found a float-to-PCM int16 converter in HudsonHuang's GitHub gist.

    Using it, I was able to rewrite the recording code as follows:

    import pyaudio
    import numpy as np
    import scipy.io.wavfile as wf
    
    
    
    # From https://gist.github.com/HudsonHuang/fbdf8e9af7993fe2a91620d3fb86a182
    def float2pcm(sig, dtype='int16'):
        """Convert floating point signal with a range from -1 to 1 to PCM.
        Any signal values outside the interval [-1.0, 1.0) are clipped.
        No dithering is used.
        Note that there are different possibilities for scaling floating
        point numbers to PCM numbers, this function implements just one of
        them.  For an overview of alternatives see
        http://blog.bjornroche.com/2009/12/int-float-int-its-jungle-out-there.html
        Parameters
        ----------
        sig : array_like
            Input array, must have floating point type.
        dtype : data type, optional
            Desired (integer) data type.
        Returns
        -------
        numpy.ndarray
            Integer data, scaled and clipped to the range of the given
            *dtype*.
        See Also
        --------
        pcm2float, dtype
        """
        sig = np.asarray(sig)
        if sig.dtype.kind != 'f':
            raise TypeError("'sig' must be a float array")
        dtype = np.dtype(dtype)
        if dtype.kind not in 'iu':
            raise TypeError("'dtype' must be an integer type")
    
        # Scale by the integer type's magnitude, then clip to its valid range
        i = np.iinfo(dtype)
        abs_max = 2 ** (i.bits - 1)
        offset = i.min + abs_max
        return (sig * abs_max + offset).clip(i.min, i.max).astype(dtype)
    
    p = pyaudio.PyAudio()
    stream = p.open(format=pyaudio.paFloat32,
                channels=2,
                rate=16000,
                input=True,
                input_device_index=0,
                frames_per_buffer=1024)
    linearFrames = []
    frames = [[], []]  # Initialize array to store frames
    
    print("recording...")
    
    for i in range(0, 128):
        data = stream.read(1024)
        decoded = np.frombuffer(data, np.float32)
    
        decoded = float2pcm(decoded, 'int16')
    
        linearFrames = np.append(linearFrames, decoded)
    
    
    linearFrames = linearFrames.astype(np.int16)
    decodedSplit = np.stack((linearFrames[::2], linearFrames[1::2]), axis=0)  # channels on separate axes
    
    stream.close()
    p.terminate()
    
    wf.write('test.wav', 16000, decodedSplit.T.astype(np.int16))
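For anyone who wants to check the float-to-int16 step without a sound card, here is a minimal sketch that runs on synthetic data. The helper names float_to_int16 and split_channels are my own illustrative names (they are not part of PyAudio or the gist), and the sine wave plus the deliberately out-of-range second channel are stand-ins for what stream.read would return. It only covers signed 16-bit output, not the general dtype handling in float2pcm.

```python
import numpy as np


def float_to_int16(sig):
    """Scale float samples in [-1.0, 1.0) to int16, clipping anything outside."""
    sig = np.asarray(sig, dtype=np.float32)
    info = np.iinfo(np.int16)
    scaled = sig * 32768.0  # 2 ** 15, the magnitude of the int16 range
    return np.clip(scaled, info.min, info.max).astype(np.int16)


def split_channels(interleaved, channels=2):
    """Reshape interleaved samples [L0, R0, L1, R1, ...] to (frames, channels)."""
    return interleaved.reshape(-1, channels)


# Synthetic stand-in for one buffer of interleaved stereo float32 audio
rate = 16000
n_frames = 1024
t = np.arange(n_frames) / rate
interleaved = np.empty(2 * n_frames, dtype=np.float32)
interleaved[0::2] = 0.5 * np.sin(2 * np.pi * 440 * t)  # left: 440 Hz tone
interleaved[1::2] = 1.5                                # right: out of range on purpose
data = interleaved.tobytes()                           # raw bytes, like stream.read()

# Collect converted chunks in a list and concatenate once at the end
chunks = [float_to_int16(np.frombuffer(data, dtype=np.float32))]
pcm = split_channels(np.concatenate(chunks))

print(pcm.dtype, pcm.shape)
```

The resulting array has shape (frames, channels), which is the layout scipy.io.wavfile.write takes for multi-channel data, so no transpose is needed. Collecting chunks in a list and concatenating once also avoids the repeated copying that np.append does inside a loop, and keeps the data as int16 throughout.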