
Type Conversion from PyAudio str to AudioSegment returns errors


I'm building an embedded dynamic range compressor, similar to those found on professional audio mixers. I capture the audio samples with PyAudio, starting from its "wire" example.

What's Supposed to Happen

The library splits those samples into "chunks" and streams them back shortly after recording. I'm simply trying to compress a chunk whenever the incoming signal becomes too loud. However, the types don't match.

The types which are being used are:

  • data = samples from the stream <type 'str'> - a byte string (Python 2 str), not Unicode text
  • chunk = number of frames per read <type 'int'> - always 1024
  • stream.write(data, chunk) <type 'NoneType'>
  • compressed_segment = to be compressed <class 'pydub.audio_segment.AudioSegment'>
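To make the relationship between data and chunk concrete, here is a minimal sketch that needs no audio hardware. It assumes 16-bit stereo audio (CHANNELS and SAMPLE_WIDTH are illustrative values, not taken from the question) and uses a buffer of silence as a stand-in for the result of stream.read(chunk). Under Python 3 that buffer is a bytes object rather than a str.

```python
import array

CHUNK = 1024       # frames per read, as in the question
CHANNELS = 2       # assumption: stereo
SAMPLE_WIDTH = 2   # assumption: 16-bit samples, i.e. 2 bytes each

# Stand-in for data = stream.read(CHUNK): one chunk of digital silence.
data = bytes(CHUNK * CHANNELS * SAMPLE_WIDTH)

# Interpret the raw bytes as signed 16-bit integers (typecode "h").
samples = array.array("h")
samples.frombytes(data)

# One frame holds one sample per channel.
frames_in_data = len(data) // (CHANNELS * SAMPLE_WIDTH)

print(type(data).__name__, len(data), len(samples), frames_in_data)
```

So chunk is only a frame count, while data carries the actual samples. That is why passing chunk to a function that expects audio fails.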

What's Happening

The method stream.read() returns the samples as a string, which I store in data. I need to convert these string samples to an AudioSegment object in order to use the compression function.

As a result, I get several errors related to the type conversion, depending on how I have everything set up. I know the data is not the right type, so how can I make this conversion work?

Here are two ways I've tried to do the conversion within the for i in range loop:

1. Creating a "wave" object before compression

wave_file = wave.open(f="compress.wav", mode="wb")
wave_file.writeframes(data)
frame_rate = wave_file.getframerate()
wave_file.setnchannels(2)
# Create the proper file
compressed = AudioSegment.from_raw(wave_file)
compress(compressed) # Calling compress_dynamic_range in Pydub

Exception wave.Error: Error('# channels not specified',) in <bound method Wave_write.__del__ of <wave.Wave_write instance at 0x000000000612FE88>> ignored

2. Sending RAW PyAudio data to compress method

data = stream.read(chunk)
compress(chunk) # Calling compress_dynamic_range in Pydub

thresh_rms = seg.max_possible_amplitude * db_to_float(threshold)
AttributeError: 'int' object has no attribute 'max_possible_amplitude'


Solution

  • The first error was thrown because frames were written to the wave file before the number of channels was set. It can be fixed as follows:

    # inside for i in range loop
    filename = "compress_%s.wav" % i  # one temp file per chunk
    wave_file = wave.open(filename, "wb")
    wave_file.setnchannels(channels)
    wave_file.setsampwidth(sample_width)
    wave_file.setframerate(sample_rate)
    wave_file.writeframesraw(data) # place this after all attributes are set
    wave_file.close()

    # send temp files to compressor: load by file name, since wave_file is closed
    segment = AudioSegment.from_wav(filename)
    compressed = compress(segment) # compress is compress_dynamic_range from pydub.effects
    

    The resulting AudioSegment can then be passed to the pydub function compress_dynamic_range.
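    The ordering requirement can be checked with the standard library alone. The sketch below is not the original poster's code: the parameter values, the temp directory, and the silent buffer standing in for one chunk from stream.read() are all assumptions for illustration. It sets every header field before writing frames, then reads the file back to confirm the header.

```python
import os
import tempfile
import wave

CHANNELS = 2         # assumption: stereo
SAMPLE_WIDTH = 2     # assumption: 16-bit samples
SAMPLE_RATE = 44100  # assumption: 44.1 kHz
CHUNK = 1024         # frames per chunk, as in the question

# Stand-in for one chunk returned by stream.read(CHUNK).
data = bytes(CHUNK * CHANNELS * SAMPLE_WIDTH)

# Hypothetical temp location for the per-chunk file.
path = os.path.join(tempfile.mkdtemp(), "compress_0.wav")

# Set every header field first, then write the frames.
wave_file = wave.open(path, "wb")
wave_file.setnchannels(CHANNELS)
wave_file.setsampwidth(SAMPLE_WIDTH)
wave_file.setframerate(SAMPLE_RATE)
wave_file.writeframes(data)
wave_file.close()

# Read the file back to confirm the header matches what was set.
reader = wave.open(path, "rb")
params = (reader.getnchannels(), reader.getsampwidth(),
          reader.getframerate(), reader.getnframes())
reader.close()

print(params)
```

    Writing frames before setnchannels() is what raises the '# channels not specified' error shown in the question.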

    However...

    A more efficient way to do this, without creating the temp wav files, is to build an AudioSegment object directly from the raw bytes as shown below. The compressed sound can then be streamed back to PyAudio with stream.write().

    sound = AudioSegment(data, sample_width=2, channels=2, frame_rate=44100)
    compressed = compress(sound) # compress is compress_dynamic_range from pydub.effects
    stream.write(compressed.raw_data, chunk) # stream via speakers / headphones