Tags: javascript, text-to-speech, webspeech-api, speechsynthesizer

Improving pronunciation when using the SpeechSynthesis Interface of the Web Speech API


I'm writing a front-end User Interface which uses the SpeechSynthesis Interface of the Web Speech API.

I'm generally happy with it. I don't mind whether the words are pronounced in an American, British, Australian, Kiwi, South African, Indian, Singlish, Filipino or any other accent, but I'm unsure how to handle cases where the SpeechSynthesis Interface mangles words.

The only solution I've come up with so far is to replace:

  • a single string (where the Speech Synthesizer speaks the same string as the user reads)

with:

  • a pair of strings (where the Speech Synthesizer speaks a different string - a designated counterpart - to the string which the user reads)

Example:

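// JavaScript: speak the text stored in each button's data-word attribute
const buttons = document.querySelectorAll('button');

const speakWord = (e) => {
  // Read the string to speak from the clicked button's data-word attribute
  const word = e.target.dataset.word;
  const utterance = new SpeechSynthesisUtterance(word);
  speechSynthesis.speak(utterance);
};

buttons.forEach((button) => {
  button.addEventListener('click', speakWord, false);
});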
/* CSS */
h2 {
  display: inline-block;
  margin-right: 12px;
  font-size: 14px;
}

button {
  cursor: pointer;
}
<!-- HTML -->
<h2>Say:</h2>
<button type="button" data-word="manifests">"manifests"</button>
<button type="button" data-word="modules">"modules"</button>
<button type="button" data-word="configuration">"configuration"</button>

<br>

<h2>Now say:</h2>
<button type="button" data-word="protocols">"protocols"</button>
<button type="button" data-word="web app">"web app"</button>

<br>

<h2>Finally say:</h2>
<button type="button" data-word="proto kols">"proto kols"</button>
<button type="button" data-word="weh bapp">"weh bapp"</button>

Short of creating words in pairs and telling the Speech Synthesizer to speak "proto kols" and "weh bapp" whenever the words "protocols" and "web app" are displayed to the user, are there any other approaches I can use to override the mangling?
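For reference, the pairing approach described above could be kept in one place as a lookup table rather than spread across data-word attributes. This is only a minimal sketch: the names spokenForms, toSpokenText and speakDisplayed are hypothetical, and the respellings are simply the ones from the example, which may need tuning per voice.

```javascript
// Hypothetical lookup table: displayed text -> text handed to the synthesizer.
// The respellings are the ones from the question; tune them per voice.
const spokenForms = new Map([
  ['protocols', 'proto kols'],
  ['web app', 'weh bapp'],
]);

// Escape characters that have special meaning in a regular expression.
const escapeRegExp = (s) => s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Return the string to speak for a displayed string, replacing every
// whole-word, case-insensitive occurrence of a key with its respelling.
const toSpokenText = (displayText) => {
  let spoken = displayText;
  for (const [shown, respelled] of spokenForms) {
    const pattern = new RegExp(`\\b${escapeRegExp(shown)}\\b`, 'gi');
    // A function replacement avoids special handling of "$" in the respelling.
    spoken = spoken.replace(pattern, () => respelled);
  }
  return spoken;
};

// Speak the respelled text while the page keeps showing the original.
const speakDisplayed = (displayText) => {
  // Guard so the sketch also loads outside a browser (e.g. in Node).
  if (typeof speechSynthesis === 'undefined') return;
  speechSynthesis.speak(new SpeechSynthesisUtterance(toSpokenText(displayText)));
};
```

With this in place, a click handler could call speakDisplayed(button.textContent) and the buttons would no longer need a separate data-word value for each mangled word.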


Solution

  • I had the same problem. I tried setting the utterance.rate property to a lower value than its default of 1. For English, the speech was easier to understand with rate set to 0.5. You can lower the value further, as long as the resulting speech is not too slow for your users.

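    // Speak once at the default rate (1) for comparison
    const normal = new SpeechSynthesisUtterance('web app');
    speechSynthesis.speak(normal);

    // Speak again at half speed, using a separate utterance object
    const slow = new SpeechSynthesisUtterance('web app');
    slow.rate = 0.5;
    speechSynthesis.speak(slow);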
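If the rate is going to be adjusted in several places, it may help to wrap it in a small helper. The sketch below assumes the range commonly documented for SpeechSynthesisUtterance.rate (0.1 to 10, default 1); clampRate and speakSlowly are hypothetical names, not part of the Web Speech API.

```javascript
// Range commonly documented for SpeechSynthesisUtterance.rate (assumption).
const MIN_RATE = 0.1;
const MAX_RATE = 10;

// Clamp a requested rate into the valid range; fall back to 1 for non-numbers.
const clampRate = (rate) => {
  if (typeof rate !== 'number' || Number.isNaN(rate)) return 1;
  return Math.min(MAX_RATE, Math.max(MIN_RATE, rate));
};

// Hypothetical helper: speak text at a given rate (0.5 by default, as above).
const speakSlowly = (text, rate = 0.5) => {
  // Guard so the sketch also loads outside a browser (e.g. in Node).
  if (typeof speechSynthesis === 'undefined') return null;
  const utterance = new SpeechSynthesisUtterance(text);
  utterance.rate = clampRate(rate);
  speechSynthesis.speak(utterance);
  return utterance;
};
```

Calling speakSlowly('web app') would then speak at half speed, and speakSlowly('web app', 0.8) would try a less drastic slowdown. Whether a slower rate actually fixes a given word depends on the voice, so it is worth testing per browser.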