
Google Speech-to-Text API reports an audio length of 00:00


I have an audio clip that is about 40 minutes long. I uploaded it to Google Cloud Storage (GCS) and used its URI in the audio configuration. The audio assessment reported an estimated duration of 00:00, which is clearly wrong, and the transcription result was empty as well.

[Screenshot: the audio assessment reporting an estimated duration of 00:00]

The file was originally in .m4a format. I changed it to other formats (.wav and .flac), but those also gave a length of 00:00. The API worked only when I trimmed the audio to its first 40 seconds or its first 100 seconds; it failed when I trimmed it to the first 10 minutes.

Please advise if you have any idea about this problem. Thanks!


Solution

  • I solved my problem by converting the .m4a file to FLAC with an online tool (e.g. convertio.co). After that, the API worked like a charm.

    Originally, I had simply renamed the file extension from .m4a to .flac on my Windows computer. It turns out that this does not do the trick: renaming changes only the file name, not the audio encoding inside the file.
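
For anyone hitting the same thing, here is a rough sketch of how you might check whether a file really is what its extension claims before uploading it. It only looks at the first few bytes, so treat it as a heuristic rather than a full validator (for example, a FLAC file with a tag block prepended would come back as "unknown"). The function names are my own, and the ffmpeg helper is just a local alternative to the online converter used above; the mono / 16 kHz settings are an assumption on my part, not something the answer requires.

```python
from pathlib import Path
from typing import List


def sniff_audio_container(header: bytes) -> str:
    """Guess the container format from a file's first 12 bytes (heuristic)."""
    if header[:4] == b"fLaC":  # FLAC stream marker
        return "flac"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":  # RIFF/WAVE header
        return "wav"
    if header[4:8] == b"ftyp":  # MP4-family file type box, used by .m4a
        return "mp4/m4a"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 12 bytes of `path` and classify them, ignoring the extension."""
    with Path(path).open("rb") as f:
        return sniff_audio_container(f.read(12))


def ffmpeg_flac_command(src: str, dst: str) -> List[str]:
    """Build (but do not run) an ffmpeg command that re-encodes `src` to `dst`.

    Assumption: mono (-ac 1) at 16 kHz (-ar 16000); adjust to your own setup.
    """
    return ["ffmpeg", "-i", src, "-ac", "1", "-ar", "16000", dst]
```

A file that was only renamed from .m4a to .flac would still be reported as "mp4/m4a" by `sniff_file`, which is the situation described above. If ffmpeg is installed, the list returned by `ffmpeg_flac_command("clip.m4a", "clip.flac")` can be passed to `subprocess.run` to do a real conversion (the file names here are placeholders).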