Tags: bash, base64, google-speech-api

Google Speech bash script with base64: Unexpected token.\n


I adapted the following code for my own bash script: https://github.com/sararob/ml-talk-demos/blob/master/speech/request.sh

cat <<EOF > $JSONFILENAME
{
  "config": {
    "encoding":"LINEAR16",
    "sampleRateHertz":8000,
    "languageCode": "nl-NL",
    "speechContexts": {
      "phrases": ['']
    },
    "maxAlternatives": 1
  },
  "audio": {
    "content":
    }
}
EOF
base64 $1 -w 0 > $SOUNDFILE.base64
#MYBASE64=$(base64 $1 -w 0)
sed -i $JSONFILENAME -e "/\"content\":/r $SOUNDFILE.base64"
#sed -i $JSONFILENAME -e "/\"content\":/r $MYBASE64"
curl -s -X POST -H "Content-Type: application/json" --data-binary @${JSONFILENAME} https://speech.googleapis.com/v1/speech:recognize?key=$API_KEY

The sed command inserts the base64 output in the right place, but it also adds newlines around it.

This is the Google API response :

{
  "error": {
    "code": 400,
    "message": "Invalid JSON payload received. Unexpected token.\n\": {\n    \"content\":\nUklGRqTIAgBXQVZFZm10\n                    ^",
    "status": "INVALID_ARGUMENT"
  }
}

How can I make sure the "content" value in my JSON object is one continuous base64 string?


Solution

  • You should avoid updating JSON data with sed.

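    If you start from valid JSON data (i.e. you first have to fix the lines to read "phrases": [] and "content": ""), you could use jq instead. Note that jq writes the updated document to standard output, so redirect it to a new file: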

    jq ".audio.content = \"$(base64 -w 0 "$1")\"" "$JSONFILENAME"
    

    I don't recommend sed, but in this case, where a large value must be inserted, you could try wrapping the base64 output in double quotes before sed reads it in:

    echo \"$(base64 -w 0 "$1")\" > "$SOUNDFILE.base64"
    sed -i "$JSONFILENAME" -e "/\"content\":/r $SOUNDFILE.base64"
    

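    The Google error you receive is most likely because the base64 string is not enclosed in double quotes. The newlines that sed adds around the string are harmless, since JSON allows whitespace between tokens.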
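
As an alternative to patching the file afterwards, the request body can be written in one pass with the quoted base64 string streamed straight into place. This is only a minimal sketch, not the method from the answer above: the function name build_request is hypothetical, the config values are copied from the question, and speechContexts is left out for brevity. It relies on the fact that base64 output contains only letters, digits, "+", "/" and "=", so it needs no JSON escaping. Newlines are stripped with tr, so the sketch does not depend on the GNU-specific -w 0 flag.

```shell
# build_request AUDIO_FILE
# Hypothetical helper: prints a speech:recognize request body on stdout.
build_request() {
  audio_file=$1

  # Fixed part of the request; values taken from the question.
  cat <<'EOF'
{
  "config": {
    "encoding": "LINEAR16",
    "sampleRateHertz": 8000,
    "languageCode": "nl-NL",
    "maxAlternatives": 1
  },
  "audio": {
EOF

  # Opening quote, then the base64 data as one unbroken line, then the
  # closing quote. Streaming avoids holding the audio in a shell variable.
  printf '    "content": "'
  base64 < "$audio_file" | tr -d '\n'
  printf '"\n  }\n}\n'
}

# Possible usage (not run here; variables as in the question):
#   build_request "$1" > "$JSONFILENAME"
#   curl -s -X POST -H "Content-Type: application/json" \
#     --data-binary @"$JSONFILENAME" \
#     "https://speech.googleapis.com/v1/speech:recognize?key=$API_KEY"
```

If jq is installed, running jq empty "$JSONFILENAME" before the curl call is a cheap way to catch a malformed payload locally: it exits with a non-zero status when the file does not parse as JSON.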