api speech google-speech-api opus

Google Speech API Empty Answer


For testing, I used Google's example for the Speech API (https://cloud.google.com/speech-to-text/docs/reference/rest/v1/speech/recognize). I tried it with this .ogg file (https://www.dropbox.com/s/lw66x3g143mtnsl/SpeechToText.ogg?dl=0), which I converted to 16000 Hz. Here is the full request:

{
  "audio": {
    "content": " content "
  },
  "config": {
    "encoding": "OGG_OPUS",
    "languageCode": "de-DE",
    "sampleRateHertz": 16000
  }
}

I then converted the audio file with a Base64 encoder (https://www.giftofspeed.com/base64-encoder/). The resulting content was too long to include here. My problem is that I get only an empty answer: the response code is 200, but nothing else comes back.
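
For reference, the request body above could also be built locally rather than with an online encoder. This is only a minimal sketch: the function name `build_recognize_body` is hypothetical, and it reproduces the fields from the request shown above without sending anything to the API.

```python
import base64
import json


def build_recognize_body(audio_bytes, language_code="de-DE", sample_rate_hertz=16000):
    """Return the JSON request body as a string (hypothetical helper)."""
    # The REST request carries the raw audio bytes as a Base64 string.
    content = base64.b64encode(audio_bytes).decode("ascii")
    body = {
        "audio": {"content": content},
        "config": {
            # Same values as in the request above; the encoding is only a
            # declaration and must match what the file really contains.
            "encoding": "OGG_OPUS",
            "languageCode": language_code,
            "sampleRateHertz": sample_rate_hertz,
        },
    }
    return json.dumps(body)


# Usage (assumes the file exists in the working directory):
#   with open("SpeechToText.ogg", "rb") as f:
#       print(build_recognize_body(f.read()))
```

Building the body this way avoids copy-and-paste errors with very long Base64 strings, but it does not change what the audio is encoded with.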

Thanks for all answers!


Solution

  • The .ogg file you linked is encoded with the Vorbis codec, not Opus, so it does not match the OGG_OPUS encoding declared in your request. You can use opus-tools to encode your audio as an Opus file before you send it to Google's service.

    Here's the debugging I used to identify your file as Vorbis:

    opusinfo

    $ opusinfo SpeechToText.ogg 
    Processing file "SpeechToText.ogg"...
    
    Use ogginfo for more information on this file.
    New logical stream (#1, serial: ffe6c0ca): type Vorbis
    Logical stream 1 ended
    

    ffmpeg

    $ ffmpeg -i SpeechToText.ogg 
    ffmpeg version 3.4.2 Copyright (c) 2000-2018 the FFmpeg developers
    Input #0, ogg, from 'SpeechToText.ogg':
      Duration: 00:00:03.41, start: 0.000000, bitrate: 116 kb/s
        Stream #0:0: Audio: vorbis, 16000 Hz, stereo, fltp, 160 kb/s
        Metadata:
          ENCODER         : Lavc58.18.100 libvorbis
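
The same check can be sketched in a few lines of Python, assuming the standard Ogg page layout (a 27-byte header starting with `OggS`, followed by a segment table) and the usual codec signatures in the first packet (`OpusHead` for Opus, `\x01vorbis` for Vorbis). The function names are hypothetical. The re-encoding command uses ffmpeg rather than opus-tools, assumes an ffmpeg build with libopus enabled, and downmixes to mono as a further assumption.

```python
def detect_ogg_codec(data):
    """Return 'opus', 'vorbis', or 'unknown' for the first page of an Ogg file."""
    # An Ogg page header is 27 bytes long and starts with the capture pattern.
    if len(data) < 27 or data[:4] != b"OggS":
        return "unknown"
    # Byte 26 is the number of entries in the segment table after the header.
    segment_count = data[26]
    payload = data[27 + segment_count:]
    # The first packet of the stream identifies the codec.
    if payload.startswith(b"OpusHead"):
        return "opus"
    if payload.startswith(b"\x01vorbis"):
        return "vorbis"
    return "unknown"


def build_reencode_command(src, dst):
    """Build an ffmpeg argument list that re-encodes src as 16 kHz mono Opus."""
    # Assumption: ffmpeg is installed and was built with libopus.
    # Assumption: mono output (-ac 1) is wanted; the source file is stereo.
    return ["ffmpeg", "-i", src, "-c:a", "libopus", "-ar", "16000", "-ac", "1", dst]


# Usage (assumes the file exists in the working directory):
#   import subprocess
#   with open("SpeechToText.ogg", "rb") as f:
#       if detect_ogg_codec(f.read(4096)) != "opus":
#           subprocess.run(
#               build_reencode_command("SpeechToText.ogg", "SpeechToText.opus.ogg"),
#               check=True,
#           )
```

Only the first page is inspected, which is enough to tell the two codecs apart in a single-stream file like the one above.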