
Exporting Audio for Google Speech using pydub


I'm trying to export audio files to LINEAR16 for Google Speech, and I notice that they specify little-endian byte ordering. I'm using pydub to export to 'raw' format, but I can't tell from the documentation (or the source) whether the exported files are little-endian or big-endian. I'm using the following commands for exporting:

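from pydub import AudioSegment

# from_file is a classmethod of AudioSegment, not a module-level function
audio = AudioSegment.from_file(self.mFilePathName, "mp4")
fullFileNameRaw = "audio.raw"
audio.export(fullFileNameRaw, format='raw')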

Thank you. -K


Solution

  • According to this answer, standard (RIFF) wave files are little-endian. Pydub uses the stdlib wave module to write wave files, so I'm guessing the output is little-endian. (If you write the file with the wave headers, it does in fact have RIFF at the beginning.)

    Looking into it a little further, though, it seems like it may depend on the hardware platform's endianness. x86 and AMD64 are both little-endian, so that covers basically all the places people would run pydub (I think?).
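
If you'd rather not rely on the platform, here is a minimal sketch using only the standard library (no pydub). It assumes 16-bit samples held in native byte order, which is my guess at what the raw export contains and not something I've confirmed in the pydub source. The helper names to_little_endian_pcm16 and wav_bytes are made up for illustration. The first forces little-endian output; the second writes the same samples through the wave module so you can check the RIFF header.

```python
import array
import io
import struct
import sys
import wave


def to_little_endian_pcm16(native_pcm: bytes) -> bytes:
    """Return 16-bit PCM as little-endian bytes.

    Assumes native_pcm holds 2-byte samples in the machine's native order.
    """
    samples = array.array("h")  # signed short, 2 bytes on common platforms
    samples.frombytes(native_pcm)
    if sys.byteorder == "big":
        samples.byteswap()  # only needed on big-endian hardware
    return samples.tobytes()


def wav_bytes(native_pcm: bytes, frame_rate: int = 16000) -> bytes:
    """Write mono 16-bit PCM through the stdlib wave module, in memory."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(frame_rate)
        w.writeframes(native_pcm)
    return buf.getvalue()


# Three sample values in native order, and the same values packed
# explicitly as little-endian for comparison.
native = array.array("h", [1, -2, 300]).tobytes()
expected = struct.pack("<3h", 1, -2, 300)

print(sys.byteorder)
print(to_little_endian_pcm16(native) == expected)
print(wav_bytes(native)[:4])
```

On a little-endian machine the conversion leaves the bytes unchanged, so for x86/AMD64 the file from export(format='raw') should already be usable as LINEAR16 if the assumption above holds.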