Tags: node.js, firebase, speech-recognition, google-cloud-speech

Is it possible to use .m4a files with Google Speech to Text API?


I recently filed an issue asking Google about this, but the answer was quite confusing. The person who answered said that it is possible if you specify "MP3" as the encoding.

I tried that and it did not work.

However, the person at Google closed the issue, so I really do not know how to proceed.

https://issuetracker.google.com/issues/166478543

My understanding is that the audio in my .m4a file is not MP3-encoded and that the person who answered got this a bit wrong.
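One way to check this suspicion is to look at the first bytes of the file. The sketch below is only a rough heuristic, not a full parser: it relies on the fact that MP4-family files (which include .m4a) carry the ASCII marker `ftyp` at byte offset 4, while many MP3 files start with an `ID3` tag. The function names `sniffAudioContainer` and `sniffFile` are made up for this illustration.

```javascript
const fs = require('fs');

// Guess the container from the leading bytes of an audio file.
// Heuristic only: an MP3 without an ID3 tag is reported as 'unknown'.
function sniffAudioContainer(head) {
  // Read a slice of the buffer as single-byte characters.
  const ascii = (start, end) => head.toString('latin1', start, end);
  if (head.length >= 8 && ascii(4, 8) === 'ftyp') return 'mp4'; // .m4a / .mp4
  if (head.length >= 12 && ascii(0, 4) === 'RIFF' && ascii(8, 12) === 'WAVE') return 'wav';
  if (head.length >= 3 && ascii(0, 3) === 'ID3') return 'mp3';
  return 'unknown';
}

// Read only the first 12 bytes of a file and classify them.
function sniffFile(path) {
  const fd = fs.openSync(path, 'r');
  try {
    const head = Buffer.alloc(12);
    const bytesRead = fs.readSync(fd, head, 0, 12, 0);
    return sniffAudioContainer(head.subarray(0, bytesRead));
  } finally {
    fs.closeSync(fd);
  }
}
```

If `sniffFile` reports `'mp4'` for the .m4a file, the file is an MP4 container rather than an MP3 stream, which would be consistent with the "MP3" encoding setting not working.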

(I was also advised not to use .m4a, but that is not an option in my case: I do not produce the files and have no influence whatsoever over how they are made.)

Can someone here clarify whether the Google Speech-to-Text API can handle .m4a? (I have added some tags to clarify the environment.)


Solution

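  • If I were in your position, I would use https://www.npmjs.com/package/audiobuffer-to-wav to convert the M4A to WAV, then send the WAV file, which Google's speech recognition accepts easily. Note that the package encodes an already decoded AudioBuffer as WAV, so the M4A has to be decoded first.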
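
As a server-side alternative to the package above, here is a minimal Node.js sketch of the same convert-to-WAV idea. It swaps in a different technique: it shells out to the `ffmpeg` command-line tool, which is an assumption (it must be installed and on the PATH) and is not mentioned in the original answer. The helper names, the 16 kHz sample rate, and the `en-US` language code are illustrative choices only.

```javascript
const { execFile } = require('child_process');

// Hypothetical helper: ffmpeg arguments that decode any input file to
// mono, 16-bit PCM WAV at the given sample rate (-y overwrites the output).
function buildFfmpegArgs(inputPath, outputPath, sampleRateHertz = 16000) {
  return [
    '-y',
    '-i', inputPath,
    '-ac', '1',                      // one audio channel
    '-ar', String(sampleRateHertz),  // output sample rate
    '-c:a', 'pcm_s16le',             // 16-bit little-endian PCM
    outputPath,
  ];
}

// Hypothetical helper: a recognition config that matches the WAV above.
function buildRecognitionConfig(sampleRateHertz = 16000, languageCode = 'en-US') {
  return { encoding: 'LINEAR16', sampleRateHertz, languageCode };
}

// Run ffmpeg (assumed to be installed) and resolve with the output path.
function convertToWav(inputPath, outputPath) {
  return new Promise((resolve, reject) => {
    execFile('ffmpeg', buildFfmpegArgs(inputPath, outputPath), (err) => {
      if (err) reject(err);
      else resolve(outputPath);
    });
  });
}
```

After `convertToWav('input.m4a', 'output.wav')` resolves, the WAV bytes (base64-encoded) and the object from `buildRecognitionConfig()` could be passed as `audio.content` and `config` in a recognize request made with the `@google-cloud/speech` client.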