PyAudio stream.read returns static with int16, but good audio with float32

I am working on getting audio to record on a Raspberry Pi, but have run into an issue. When I use paFloat32 in PyAudio and decode with np.frombuffer(data, np.float32), I get good audio. But if I use paInt16 and np.int16, I get garbage static.

Minimal code is here:

import pyaudio
import numpy as np
import scipy.io.wavfile as wf

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32,
            channels=2,
            rate=16000,
            input=True,
            input_device_index=0,
            frames_per_buffer=1024)
frames = [[],[]]  # Initialize array to store frames

print("recording...")

for i in range(0, 128):
    data = stream.read(1024)
    decoded = np.frombuffer(data, np.float32)
    decodedSplit = np.stack((decoded[::2], decoded[1::2]), axis=0)  # channels on separate axes
    frames = np.append(frames, decodedSplit, axis=1)



stream.close()
p.terminate()

wf.write('test.wav', 16000, frames.T)

From what I understand, changing paFloat32 to paInt16 and np.float32 to np.int16 should be all that I need to do, but it does not work. Is there a possible issue with the sound card, or is something misconfigured? I have stared at this for 2 days now and am stuck. Again, float32 works perfectly, but the next chunk of code relies on someone else's library written for int16.
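One thing that may be worth checking when making that swap (an assumption, not verified against this hardware): the accumulator `frames = [[], []]` is a Python list, which NumPy converts to an empty float64 array, so `np.append` promotes int16 chunks to float64. If that array is then passed to `wf.write`, SciPy would likely write a floating-point WAV, where samples are expected to lie roughly within -1.0 to 1.0, and int16-sized values would be far outside that range. The sketch below uses a hand-made `chunk` array as a stand-in for decoded audio and only demonstrates the dtype promotion itself.

```python
import numpy as np

# Stand-in for one decoded paInt16 stereo chunk (2 channels x 4 samples).
# The values are made up for illustration.
chunk = np.array([[1000, -2000, 3000, -4000],
                  [500, -600, 700, -800]], dtype=np.int16)

# Same accumulator pattern as in the recording loop.
frames = [[], []]
frames = np.append(frames, chunk, axis=1)
promoted_dtype = frames.dtype  # float64: the empty list was converted to a float64 array

# Collecting chunks in a list and concatenating once keeps the int16 dtype.
chunks = [chunk, chunk]
kept = np.concatenate(chunks, axis=1)
kept_dtype = kept.dtype  # int16

print(promoted_dtype, kept_dtype)
```

If this is what is happening, casting back with `.astype(np.int16)` before writing, or never leaving int16 in the first place, should be enough.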

Solution

Rather than trying to force int16 capture to work, I kept recording in float32 and found a float-to-PCM int16 converter in HudsonHuang's GitHub gist.

From this, I was able to rewrite the recording method as follows:

import pyaudio
import numpy as np
import scipy.io.wavfile as wf



# From https://gist.github.com/HudsonHuang/fbdf8e9af7993fe2a91620d3fb86a182
def float2pcm(sig, dtype='int16'):
    """Convert floating point signal with a range from -1 to 1 to PCM.
    Any signal values outside the interval [-1.0, 1.0) are clipped.
    No dithering is used.
    Note that there are different possibilities for scaling floating
    point numbers to PCM numbers, this function implements just one of
    them.  For an overview of alternatives see
    http://blog.bjornroche.com/2009/12/int-float-int-its-jungle-out-there.html
    Parameters
    ----------
    sig : array_like
        Input array, must have floating point type.
    dtype : data type, optional
        Desired (integer) data type.
    Returns
    -------
    numpy.ndarray
        Integer data, scaled and clipped to the range of the given
        *dtype*.
    See Also
    --------
    pcm2float, dtype
    """
    sig = np.asarray(sig)
    if sig.dtype.kind != 'f':
        raise TypeError("'sig' must be a float array")
    dtype = np.dtype(dtype)
    if dtype.kind not in 'iu':
        raise TypeError("'dtype' must be an integer type")

    # Scale to the integer range of the target dtype, then clip and cast.
    i = np.iinfo(dtype)
    abs_max = 2 ** (i.bits - 1)
    offset = i.min + abs_max
    return (sig * abs_max + offset).clip(i.min, i.max).astype(dtype)

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paFloat32,
            channels=2,
            rate=16000,
            input=True,
            input_device_index=0,
            frames_per_buffer=1024)
linearFrames = []
frames = [[], []]  # Initialize array to store frames

print("recording...")

for i in range(0, 128):
    data = stream.read(1024)
    decoded = np.frombuffer(data, np.float32)

    decoded = float2pcm(decoded, 'int16')

    linearFrames = np.append(linearFrames, decoded)


linearFrames = linearFrames.astype(np.int16)
decodedSplit = np.stack((linearFrames[::2], linearFrames[1::2]), axis=0)  # channels on separate axes

stream.close()
p.terminate()

wf.write('test.wav', 16000, decodedSplit.T.astype(np.int16))
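The same convert-then-deinterleave flow can be tried without a sound card. This is a minimal sketch, not the recording code itself: `float_to_int16` is a hypothetical, int16-only simplification of `float2pcm`, and the bytes in `data` are synthetic stand-ins for what `stream.read` would return. It also collects chunks in a list and uses `reshape(-1, 2)` instead of slicing, which is one alternative way to split the channels.

```python
import numpy as np

def float_to_int16(sig):
    """Hypothetical int16-only variant of float2pcm: scale by 2**15, clip, cast."""
    info = np.iinfo(np.int16)
    scaled = np.asarray(sig, dtype=np.float64) * 2 ** 15
    return scaled.clip(info.min, info.max).astype(np.int16)

# Synthetic stand-in for stream.read(): 4 interleaved stereo frames
# (L, R, L, R, ...) as raw float32 bytes. The last frame is deliberately
# out of range to show clipping.
interleaved = np.array([0.0, 0.5, -0.5, 1.0, -1.0, 0.25, 2.0, -2.0],
                       dtype=np.float32)
data = interleaved.tobytes()

chunks = []
for _ in range(3):  # three identical chunks stand in for the read loop
    decoded = np.frombuffer(data, dtype=np.float32)
    chunks.append(float_to_int16(decoded))

pcm = np.concatenate(chunks)   # stays int16, no float64 detour
stereo = pcm.reshape(-1, 2)    # shape (frames, channels), one column per channel

print(stereo.dtype, stereo.shape)
```

An array shaped like `stereo` can be passed to `wf.write` directly, without the transpose or the final `astype`.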