I'm trying to create a Children's Animation / Short Story and put it on YouTube. I'm hoping to use Google Translate and Text-To-Speech to generate different language versions.
Since I'll need voices to express different emotions, I was wondering whether there is a way to do the following:
Can Google's Text-To-Speech allow for this customization? Thanks.
Google Text-to-Speech does not let you assign emotions to voices. Currently the only voice options are adult male and female voices in different languages. See available voice list here. Some voices in the list use the WaveNet model, which makes them sound closer to a real adult speaker.
Customization in Google's Text-to-Speech is done through Speech Synthesis Markup Language (SSML), and is currently limited to specifying pauses and controlling how acronyms, dates, times, abbreviations, or text that should be censored are read out.
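To give an idea of what that SSML customization looks like, here is a minimal sketch in Python that only builds the markup string. The function name `build_story_ssml` and its parameters are made up for illustration; `<speak>`, `<break>`, and `<say-as>` are standard SSML elements. It does not call the Text-to-Speech API, so you would still need to pass the resulting string to the service as SSML input yourself.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET


def build_story_ssml(sentence: str, pause_ms: int, acronym: str, date_mdy: str) -> str:
    """Build a small SSML document (hypothetical helper, for illustration only).

    sentence  -- plain narration text; XML special characters are escaped
    pause_ms  -- length of the pause inserted after the sentence
    acronym   -- text to be spelled out letter by letter
    date_mdy  -- a date written as month-day-year, e.g. "12-25-2024"
    """
    return (
        "<speak>"
        f"{escape(sentence)}"
        # Pause between the narration and what follows.
        f'<break time="{pause_ms}ms"/>'
        # Read the acronym one character at a time.
        f'<say-as interpret-as="characters">{escape(acronym)}</say-as>'
        # Read the string as a date rather than as three numbers.
        f'<say-as interpret-as="date" format="mdy">{escape(date_mdy)}</say-as>'
        "</speak>"
    )


ssml = build_story_ssml(
    "Once upon a time, a fox & a hen became friends.", 700, "TTS", "12-25-2024"
)
ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed XML
print(ssml)
```

Note that none of these tags changes the emotion of the voice; they only affect timing and how specific pieces of text are pronounced.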
I suggest exploring other Text-to-Speech providers that may better fit your use case.