I've been tinkering with the SpeechSynthesisUtterance API in JavaScript, and I'm trying to have a separate anonymous function fire for each word as the synthesizer speaks it.
Say I have the following sentence spoken through the API:
var message = new SpeechSynthesisUtterance('one two three');
window.speechSynthesis.speak(message);
This speaks "one two three" at a normal speed, which is fine. What I'm after is the ability to attach a function that fires at the beginning of each word, so:
* function is called with parameter "one" // starts speaking "one"
* function is called with parameter "two" // starts speaking "two"
* function is called with parameter "three" // starts speaking "three"
I have tried splitting the sentence into three separate utterances, queued one after another:
var message1 = new SpeechSynthesisUtterance('one');
var message2 = new SpeechSynthesisUtterance('two');
var message3 = new SpeechSynthesisUtterance('three');
window.speechSynthesis.speak(message1);
window.speechSynthesis.speak(message2);
window.speechSynthesis.speak(message3);
But this slowly outputs "one...... two...... three", with a long pause between the words. Otherwise this setup would be ideal, because I could attach the onstart or onend events found in the documentation.
Sounds like you want to use the SpeechSynthesisUtterance.onboundary event:
var message = new SpeechSynthesisUtterance('one two three');
message.onboundary = (e => console.log(e));
window.speechSynthesis.speak(message);
The event has a charIndex property that tells you where the boundary falls in the utterance text. It's up to you to read forward from that point up to the next word boundary to determine the word:
console.log(e.target.text.slice(e.charIndex).match(/^.+?\b/)[0]); // inside the onboundary handler
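Putting the pieces together, here is a minimal sketch of a per-word callback. The names wordAt, speakWithWordCallback and onWord are made up for this example and are not part of the Web Speech API. It assumes words are separated by whitespace (so trailing punctuation stays attached to the word), and that boundary events reported for sentences can be told apart through the event's name property. Not every browser or voice fires boundary events, so treat this as a starting point rather than a guaranteed solution.

```javascript
// Return the whitespace-delimited word starting at charIndex,
// or '' if no word starts there (for example, charIndex points at a space).
function wordAt(text, charIndex) {
  const match = text.slice(charIndex).match(/^\S+/);
  return match ? match[0] : '';
}

// Speak `text` and call onWord(word, charIndex) as each word begins.
// speakWithWordCallback and onWord are hypothetical names for this sketch.
function speakWithWordCallback(text, onWord) {
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.onboundary = (e) => {
    // Boundary events may also be reported for sentences; keep only words.
    // Assumption: an event without a name is treated as a word boundary.
    if (e.name && e.name !== 'word') return;
    onWord(wordAt(text, e.charIndex), e.charIndex);
  };
  window.speechSynthesis.speak(utterance);
  return utterance;
}

// Only run where the Web Speech API is actually available (a browser).
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  speakWithWordCallback('one two three', (word) => console.log(word));
}
```

Keeping the text lookup in its own function means it can be checked without a speech engine, and the regex can be swapped for something stricter if punctuation should be stripped.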