Python wave audio sample rate


I am trying to tie together a JavaScript front end, a Flask server, and Microsoft's Cognitive Services for audio identification.

Microsoft's server requires audio data with specific parameters; in particular, it requires a frame rate (sampling frequency) of 16000.

But from the browser on Windows I can only get 41000. So I receive the audio at 41000 and then save it like this:

import wave

audioData = message['audio']
af = wave.open('audioData.wav', 'w')
af.setnchannels(1)
# The header declares 16000 Hz; the frames themselves are written unchanged
af.setparams((1, 2, 16000, 0, 'NONE', 'Uncompressed'))
af.writeframes(audioData)
af.close()

The audio is received through socketio as dict/JSON data. If I save it directly without changing anything, it sounds fine. But if I change the sample rate to 16000, it obviously sounds distorted and very slow, so a few seconds of audio stretch into a minute or so.

How do I correctly change the audio rate without affecting how it sounds in Python 3.4?
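To see why relabeling the rate slows playback: a WAV file's duration is its frame count divided by the frame rate stored in its header, so declaring a lower rate for the same frames stretches playback by the ratio of the two rates. A small sketch, assuming a hypothetical capture rate of 44100 Hz and a three-second clip (both values are illustrative, not measured from the recording in question):

```python
def playback_seconds(n_frames, header_rate):
    """Duration a player reports: frames divided by the declared rate."""
    return n_frames / header_rate


# Assumed values for illustration only.
captured_rate = 44100          # rate the frames were actually sampled at
target_rate = 16000            # rate written into the header
n_frames = 3 * captured_rate   # three seconds of audio

original = playback_seconds(n_frames, captured_rate)
relabeled = playback_seconds(n_frames, target_rate)
stretch = relabeled / original  # equals captured_rate / target_rate
```

With these numbers the clip plays back about 2.76 times slower, which is why the frames need to be resampled rather than just relabeled.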

Thanks.

EDIT: Here is the working code:

import audioop
import wave

# Write the received bytes to disk so the wave module can parse the header
with open("audioData_original.wav", 'wb') as of:
    of.write(message['audio'])
audioFile = wave.open("audioData_original.wav", 'r')
n_frames = audioFile.getnframes()
audioData = audioFile.readframes(n_frames)
originalRate = audioFile.getframerate()
af = wave.open('audioData.wav', 'w')
af.setnchannels(1)
af.setparams((1, 2, 16000, 0, 'NONE', 'Uncompressed'))
# Resample the 16-bit mono frames from the original rate to 16000
converted = audioop.ratecv(audioData, 2, 1, originalRate, 16000, None)
af.writeframes(converted[0])
af.close()
audioFile.close()

The downside is that I already have the audio data in memory (it arrives from the MediaRecorder API through JSON), yet I write it to disk and open it again just to read the sampling rate with wave's functions. How do I do this without writing it to disk? Thanks. If I have to post a new question for that, sure, I can do that.

EDIT2: Oh, ok, answering my own follow-up question - io.BytesIO did the trick.
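For readers wondering what the io.BytesIO approach might look like, here is a minimal sketch. It assumes the received payload is a complete WAV file held as bytes; the helper names read_wav_bytes and write_wav_bytes are made up for this example, and the payload is a stand-in for message['audio'].

```python
import io
import wave


def read_wav_bytes(data):
    """Parse WAV bytes in memory; return (rate, sample_width, channels, frames)."""
    with wave.open(io.BytesIO(data), 'rb') as wf:
        return (wf.getframerate(), wf.getsampwidth(),
                wf.getnchannels(), wf.readframes(wf.getnframes()))


def write_wav_bytes(frames, rate, sample_width=2, n_channels=1):
    """Build a WAV file in memory from raw PCM frames and return its bytes."""
    buf = io.BytesIO()
    with wave.open(buf, 'wb') as wf:
        wf.setnchannels(n_channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(rate)
        wf.writeframes(frames)
    # wave does not close a file object it did not open, so buf is still usable
    return buf.getvalue()


# Stand-in for message['audio']: 8000 frames of 16-bit mono silence at 44100 Hz
payload = write_wav_bytes(b'\x00\x00' * 8000, 44100)
rate, width, channels, frames = read_wav_bytes(payload)
```

The frames and rate returned by read_wav_bytes can then go straight to the resampling step, and write_wav_bytes can produce the 16000 Hz result without touching the disk.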


Solution

  • Have a look at audioop.ratecv (it's in the standard library). Let it operate on the raw frames of your sample (in your case, audioData). It's a simple algorithm, so expect some loss of sound quality, but I guess that is insignificant for speech.
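A rough sketch of that call on raw 16-bit mono frames follows. The function name resample_mono16 is invented for this example. Note that audioop was removed from the standard library in Python 3.13, so the sketch falls back to a naive linear interpolation when the module is missing; that fallback is an assumption added here for portability, not what audioop does internally.

```python
import array
import math

try:
    import audioop  # in the standard library up to Python 3.12
except ImportError:  # removed in Python 3.13
    audioop = None


def resample_mono16(frames, src_rate, dst_rate):
    """Resample 16-bit mono PCM bytes from src_rate to dst_rate."""
    if audioop is not None:
        # ratecv(fragment, width, nchannels, inrate, outrate, state)
        converted, _state = audioop.ratecv(frames, 2, 1, src_rate, dst_rate, None)
        return converted
    # Fallback: linear interpolation between neighbouring samples.
    # array('h') uses native byte order, like audioop does.
    samples = array.array('h')
    samples.frombytes(frames)
    n_out = len(samples) * dst_rate // src_rate
    out = array.array('h')
    for i in range(n_out):
        pos = i * src_rate / dst_rate
        left = int(pos)
        right = min(left + 1, len(samples) - 1)
        frac = pos - left
        out.append(int(round(samples[left] * (1 - frac) + samples[right] * frac)))
    return out.tobytes()


# One second of a 440 Hz tone at an assumed source rate of 44100 Hz
src_rate = 44100
tone = array.array('h', (int(10000 * math.sin(2 * math.pi * 440 * n / src_rate))
                         for n in range(src_rate)))
converted = resample_mono16(tone.tobytes(), src_rate, 16000)
n_samples = len(converted) // 2  # roughly 16000 samples, i.e. still one second
```

Because the number of samples shrinks in proportion to the rate, the clip keeps its original duration when written out with a 16000 Hz header.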