According to the documentation for the client libraries, there is a Ruby client library, the google-cloud-text_to_speech gem. But the only thing that documentation covers for Ruby is the gem install step.
The next step in the above documentation is to make the actual request, but the example is only available in Go, Java, Node.js and Python.
I think I can translate the Node.js example into Ruby for my Ruby on Rails app. But I'm unclear where to set the environment variable. For Google Cloud Storage under Active Storage, I use a key file for a service account, which is referenced this way:
credentials: <%= Rails.root.join("google-credentials.json") %>
This JSON file is of course in my .gitignore, but it is set up on Heroku as well.
Based on the Node.js example, I suspect a translation along these lines, but I do not see where to put the authentication information.
def synthesize_speech
  # Create the instance of client:
  client =
  # Define the text that will be synthesized.
  text = self.content
  # Define the request
  request = {
    input: {
      text: text
    },
    voice: {
      languageCode: "en-US",
      name: "en-US-Studio-M",
      ssmlGender: "MALE"
    },
    audioConfig: {
      audioEncoding: "MP3"
    }
  }
  # Get the response
  response = client.synthesizeSpeech(request)
  response = JSON.parse(response.body)
  # OPTION 1: Extract the base64 string and put it in the database:
  self.base64 = response["audioContent"]
  # OPTION 2: Extract the base64 and write it to a temp file
  base64 = File.open("/tmp/base64.txt", "w") do |f|
    f.puts response["audioContent"]
  end
end
Once I get the above working, I can worry about the conversion to an .mp3 file. So what I need here is to figure out how to authenticate with the API.
If you use the official Ruby client from Google, the gem's documentation has more information. The credentials can be set in the configuration block when creating the client:
client = Google::Cloud::TextToSpeech.text_to_speech do |config|
  # to_s because, as far as I know, the client expects a String path (or a Hash), not a Pathname
  config.credentials = Rails.root.join("google-credentials.json").to_s
end

response = client.synthesize_speech(
  input: { text: 'Lorem ipsum' },
  voice: { name: 'en-US-Studio-M', language_code: 'en-US' },
  audio_config: { audio_encoding: 'MP3' }
)
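As a side note on where the credentials can come from: Google's auth libraries also look at the GOOGLE_APPLICATION_CREDENTIALS environment variable, which should hold the path to the key file. On Heroku it can be more convenient to keep the JSON itself in a config var and pass it as a Hash. Here is a rough sketch of that idea. The helper name tts_credentials and the config var GOOGLE_CREDENTIALS_JSON are my own inventions, not something the gem reads by itself, and I am assuming config.credentials accepts a Hash as well as a String path:

```ruby
require "json"

# Hypothetical helper: returns a value to assign to config.credentials.
# GOOGLE_CREDENTIALS_JSON is an assumed config var name holding the raw
# service account JSON; if it is missing, fall back to a key file path.
def tts_credentials(env: ENV, fallback_path: "google-credentials.json")
  raw = env["GOOGLE_CREDENTIALS_JSON"]
  return JSON.parse(raw) if raw && !raw.strip.empty?

  # Convert to String in case a Pathname (e.g. Rails.root.join(...)) was passed.
  fallback_path.to_s
end

# Usage inside a Rails app (needs the gem, so it is left commented out here):
# client = Google::Cloud::TextToSpeech.text_to_speech do |config|
#   config.credentials = tts_credentials(
#     fallback_path: Rails.root.join("google-credentials.json")
#   )
# end
```

With this, local development keeps using the ignored key file, while Heroku only needs the config var.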
Note that the audio_content is binary data.
So if you want to write it to an MP3 file, you need to use wb instead of just w when opening the file:
File.open("test.mp3", "wb") do |file|
  file.write response.audio_content
end
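To tie this back to the two options in the question: since the Ruby client hands back raw bytes rather than a base64 string, option 1 (storing base64 in the database) would need an explicit encoding step. A minimal sketch of both options together follows. save_audio is a hypothetical helper, and the literal bytes are only a stand-in for response.audio_content:

```ruby
require "tmpdir"

# Hypothetical helper: writes raw audio bytes to path in binary mode and
# returns a base64 copy of them (e.g. to store in a text column).
def save_audio(audio_bytes, path)
  File.open(path, "wb") { |file| file.write(audio_bytes) }
  [audio_bytes].pack("m0") # "m0" is strict base64 without line breaks
end

# Stand-in for response.audio_content; real data would come from the API.
fake_audio = "\xFF\xFB\x90\x00\r\n\x00".b
path = File.join(Dir.tmpdir, "test.mp3")

encoded = save_audio(fake_audio, path)

# Decoding the base64 string gives back the original bytes.
decoded = encoded.unpack1("m0")
```

In a model, the return value could be assigned to self.base64, and the file on disk is already a playable MP3 when the bytes come from the API, so no separate conversion step should be needed.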