I just started playing with the Google Text-to-Speech API. I sent a POST request to:
with the following data:
{
  "input": {
    "text": "Hola esto es una prueba"
  },
  "voice": {
    "languageCode": "es-419"
  },
  "audioConfig": {
    "audioEncoding": "LINEAR16",
    "speakingRate": 1,
    "pitch": 0
  }
}
and I got a 200 response, with the content:
{
  "audioContent": "UklGRn6iCwBXQVZFZm10I...(super long string)"
}
I am assuming this is encoded (or decoded, I'm not sure about the naming), but I would like to actually hear what that "audioContent" is.
As Tanaike pointed out, the response is indeed Base64. To actually listen to the audio, I pasted the Base64-encoded string (just the value of "audioContent", without the quotes) into a file named audio.txt, then ran:
base64 -d audio.txt > audio.wav
and that did the trick.
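If you would rather skip the copy-paste step, the same decoding can be done in a few lines of Python. This is only a minimal sketch: it assumes the whole JSON response body is available as a string, and the function name `save_audio_content` is my own, not part of any Google library. As a sanity check, the `UklGR...WQVZFZm10` prefix in the response decodes to a `RIFF`/`WAVE` header, which is consistent with the `LINEAR16` encoding requested.

```python
import base64
import json


def save_audio_content(response_json: str, out_path: str) -> int:
    """Decode the Base64 "audioContent" field of a Text-to-Speech
    response and write the raw bytes to out_path.

    Hypothetical helper for illustration; returns the number of
    bytes written.
    """
    payload = json.loads(response_json)
    # "audioContent" holds the audio file as a Base64 string.
    audio_bytes = base64.b64decode(payload["audioContent"])
    with open(out_path, "wb") as f:
        f.write(audio_bytes)
    return len(audio_bytes)
```

For example, if the response body was saved to a file (say `response.json`, a name assumed here), calling `save_audio_content(open("response.json").read(), "audio.wav")` should produce the same `audio.wav` as the `base64 -d` command above.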