javascript html speech-synthesis webspeech-api

WebSpeech Speech Synthesis: Pausing utterance1, playing another utterance2, and resuming utterance1 - possible?


I am using WebSpeech's speechSynthesis module to have a web application speak. However, it seems that you can only add utterances to a queue and then pause(), resume(), and cancel() the entire queue.

I have a situation where I want to have two utterances:

utterance1 = new SpeechSynthesisUtterance(text1);
utterance2 = new SpeechSynthesisUtterance(text2);

I would like to have utterance1 play, then pause it in the middle, have utterance2 play, and then resume utterance1. In code, it would look like this:

speechSynthesis.speak(utterance1);
// ... after a while
speechSynthesis.pause(utterance1);
speechSynthesis.speak(utterance2);
// ... after a long while
speechSynthesis.resume(utterance1);

Unfortunately, speechSynthesis' methods pause(), resume(), and cancel() do not take any argument and act on the entire speech utterance queue. Is there any way to achieve this behavior?

If I could have multiple speechSynthesis objects, then I could create one for each utterance, but I believe I can only have one.

If I could keep track of where in the string the utterance has "been uttered to" then I could cancel it and then create a new utterance with the remainder of the text, but I don't know if that is possible.

Any suggestions?


Solution

  • I have been working with speechSynthesis for a couple of months in my library Artyom.js, and according to the documentation (and all the tests I have run), pausing a single utterance and resuming another is not possible, because every reference points to the same window.speechSynthesis object. (If the API changes someday, that would be a big step forward for speechSynthesis.) When you call the pause method of the speechSynthesis "instance", it applies to the whole queue, and there is no other way.

    According to the documentation:

    // The only solution would be for the official speechSynthesis API to expose
    // a constructor, so that a genuinely new instance could be created, e.g.:
    // var synthRealInstance = new speechSynthesis();
    // but to date it does not :(
    
    var synthA =  window.speechSynthesis;
    var synthB = window.speechSynthesis;
    
    var utterance1 = new SpeechSynthesisUtterance('How about we say this now? This is quite a long sentence to say.');
    var utterance2 = new SpeechSynthesisUtterance('We should say another sentence too, just to be on the safe side.');
    
    synthA.speak(utterance1);
    synthB.speak(utterance2);
    
    synthA.pause();
    // synthA and synthB are the same object, so calling pause() on either one pauses the whole queue
    

    There is an event handler property on the utterance (onmark); however, it is not well documented and will probably not work, as this API is still experimental.

    The mark event is fired when a ‘mark’ tag is reached in a Speech Synthesis Markup Language (SSML) file. Just know that it’s possible to pass your speech data to an utterance using an XML-based SSML document. The main advantage of this is that it makes it easier to manage speech content when building applications that have large amounts of text that need to be synthesised.

    Read more about it here.
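
The last idea in the question (track how far the utterance has been spoken, cancel it, and later speak the remainder) can be approximated with the utterance's boundary event, whose charIndex gives the position of the word being reached. What follows is only a minimal sketch, not a tested cross-browser solution. createResumableSpeaker, interruptWith and the injected makeUtterance factory are hypothetical names made up for this example. It assumes the browser actually fires boundary events for the selected voice (some voices do not) and that restarting at the beginning of the interrupted word is acceptable.

```javascript
// Hypothetical helper, not part of the Web Speech API.
// `synth` is meant to be window.speechSynthesis and `makeUtterance` a wrapper
// around `new SpeechSynthesisUtterance(text)`; both are passed in so the
// bookkeeping can also be exercised outside a browser.
function createResumableSpeaker(synth, makeUtterance) {
  var fullText = '';
  var resumeAt = 0; // index in fullText of the last word boundary reached

  function speakFrom(start) {
    var utterance = makeUtterance(fullText.slice(start));
    // charIndex is relative to this utterance's own text, so add the offset.
    utterance.onboundary = function (event) {
      resumeAt = start + event.charIndex;
    };
    synth.speak(utterance);
  }

  return {
    // Start speaking `text` from the beginning.
    speak: function (text) {
      fullText = text;
      resumeAt = 0;
      speakFrom(0);
    },
    // Drop the current utterance, speak `otherText`, then continue the
    // original text from the last recorded word boundary.
    interruptWith: function (otherText) {
      var savedIndex = resumeAt;
      synth.cancel(); // empties the whole queue; there is no per-utterance cancel
      var other = makeUtterance(otherText);
      other.onend = function () {
        speakFrom(savedIndex);
      };
      synth.speak(other);
    }
  };
}

// Browser-only usage; skipped where speechSynthesis is unavailable.
if (typeof window !== 'undefined' && 'speechSynthesis' in window) {
  var speaker = createResumableSpeaker(window.speechSynthesis, function (text) {
    return new SpeechSynthesisUtterance(text);
  });
  speaker.speak('How about we say this now? This is quite a long sentence to say.');
  setTimeout(function () {
    speaker.interruptWith('We should say another sentence too, just to be on the safe side.');
  }, 1500);
}
```

This does not pause utterance1 in the strict sense: it cancels it and starts a new utterance from the remaining text, so there will be a short gap and the intonation restarts. The sketch also does not check whether the first text had already finished before interruptWith was called; in that case it would repeat the last word.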