javascript html synchronization speech-synthesis webspeech-api

How to synchronize SpeechSynthesis and Text Color Changes in Web Project


I'm currently working on a web project using the SpeechSynthesis API to read paragraphs of text on a webpage. I've been trying to synchronize the spoken words with the color changes in the text, but I'm facing some challenges.

Here's a brief overview of the issue:

  • I have a function that reads out the content of <p> tags on a page using the SpeechSynthesis API.
  • The goal is to synchronize the spoken words with color changes in real-time.
  • Specifically, I'd like each word to change to red while it's being spoken and revert to the original color when the word is completed.
  • Every attempt led to the whole paragraph turning red.

My working code without the sync is below.

function speakAllParagraphs(page) {
  // Get all <p> elements within the current page
  var paragraphs = document
    .getElementById("page" + page)
    .getElementsByTagName("p");

  // Iterate through each <p> tag
  Array.from(paragraphs).forEach(function (paragraph, index) {
    // Speak the text of the paragraph
    var text = paragraph.innerText;

    // Create a new SpeechSynthesisUtterance
    var utterance = new SpeechSynthesisUtterance();
    utterance.text = text;

    // Find the voice by name
    const voices = speechSynthesis.getVoices();
    const targetVoice = voices.find(
      (voice) =>
        voice.name === "Microsoft Emily Online (Natural) - English (Ireland)"
    );

    if (targetVoice) {
      utterance.voice = targetVoice;
    } else {
      // Fallback: if the target voice is not available, use default voice
      utterance.voice = voices[0];
    }

    // Play the synthesized speech
    speechSynthesis.speak(utterance);
  });
}
  • I attempted to use the onboundary event to change the color of individual words, but it didn't work as expected.
  • I've tried a few approaches, including using timers and events, but I haven't been able to achieve the desired synchronization.

Solution

  • You can get this behavior by listening for the boundary event on the SpeechSynthesisUtterance instance. The event exposes charIndex and charLength properties, which indicate where in the string the utterance is at that moment.

    This lets you slice out the part of the string that is currently being spoken and wrap it in HTML, such as the <mark> tag in the example below, to highlight it. Then replace the paragraph's content with the string that includes the highlight.

    Also listen for the end event to restore the original text in the paragraph when the utterance is finished.

    const target = document.querySelector('p');
    
    function speakAndHighlightText(target) {
      // Keep the original text so it can be sliced per word and restored later
      const text = target.textContent;
      const utterance = new SpeechSynthesisUtterance();
      utterance.text = text;
      
      // Fires as speech reaches each word (or sentence) boundary
      utterance.addEventListener('boundary', ({ charIndex, charLength }) => {
        const beforeWord = text.slice(0, charIndex);
        const word = text.slice(charIndex, charIndex + charLength);
        const afterWord = text.slice(charIndex + charLength);
        
        // Re-render the paragraph with only the current word wrapped in <mark>
        target.innerHTML = `${beforeWord}<mark>${word}</mark>${afterWord}`;
      });
      
      // Restore the plain text once the utterance has finished
      utterance.addEventListener('end', () => {
        target.textContent = text;
      });
      
      speechSynthesis.speak(utterance);
    }
    
    speakAndHighlightText(target);

    mark {
      background-color: red;
    }

    <html lang="en">
      <p>"Sunsets paint the sky with hues of warmth, a daily masterpiece. Nature's farewell kiss, fleeting yet timeless, whispers serenity to all."</p>
    </html>
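
One caveat with the snippet above: the paragraph text goes straight into innerHTML, so a "<" or "&" in the text would be parsed as markup. A minimal sketch of a safer way to build the string is below. escapeHtml and highlightWord are hypothetical helper names, not part of the Web Speech API, and the whitespace fallback for a missing charLength is only an assumption about how to cope with an engine that does not report it.

```javascript
// Hypothetical helpers, not part of the Web Speech API.

// Escape characters that innerHTML would otherwise parse as markup.
function escapeHtml(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Build the paragraph markup with the current word wrapped in <mark>.
// Assumption: if charLength is missing or 0, the word is taken to run
// up to the next whitespace character (or to the end of the text).
function highlightWord(text, charIndex, charLength) {
  let length = charLength;
  if (!Number.isInteger(length) || length <= 0) {
    const nextSpace = text.slice(charIndex).search(/\s/);
    length = nextSpace === -1 ? text.length - charIndex : nextSpace;
  }

  const end = charIndex + length;
  const before = escapeHtml(text.slice(0, charIndex));
  const word = escapeHtml(text.slice(charIndex, end));
  const after = escapeHtml(text.slice(end));

  return `${before}<mark>${word}</mark>${after}`;
}
```

Inside the boundary listener this would be used as target.innerHTML = highlightWord(text, charIndex, charLength). Checking that the event's name property equals 'word' before highlighting should also skip sentence boundaries.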

    I've also included a snippet that handles multiple paragraphs. The difference is that speakAndHighlightText returns a Promise that resolves on the end event, so we can await the end of one paragraph's speech before moving on to the next.

    const targets = document.querySelectorAll('p');
    
    // Returns a Promise that resolves once the paragraph has been spoken
    const speakAndHighlightText = (target) => new Promise(resolve => {
      const text = target.textContent;
      const utterance = new SpeechSynthesisUtterance();
      utterance.text = text;
      
      // Fires as speech reaches each word (or sentence) boundary
      utterance.addEventListener('boundary', ({ charIndex, charLength }) => {
        const beforeWord = text.slice(0, charIndex);
        const word = text.slice(charIndex, charIndex + charLength);
        const afterWord = text.slice(charIndex + charLength);
        
        // Re-render the paragraph with only the current word wrapped in <mark>
        target.innerHTML = `${beforeWord}<mark>${word}</mark>${afterWord}`;
      });
      
      // Restore the plain text, then let the caller move on
      utterance.addEventListener('end', () => {
        target.textContent = text;
        resolve(target);
      });
      
      speechSynthesis.speak(utterance);
    });
    
    (async () => {
      for (const target of targets) {
        await speakAndHighlightText(target);
      }
      
      console.log('Finished speaking');
    })();

    mark {
      background-color: red;
    }

    <html lang="en">
      <p>"Sunsets paint the sky with hues of warmth, a daily masterpiece. Nature's farewell kiss, fleeting yet timeless, whispers serenity to all."</p>
      
      <p>Raindrops dance on leaves, a liquid symphony. Earth sighs in relief, embracing each droplet's embrace. Nature's lullaby, calming and pure.</p>
    </html>
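
One thing to watch in the multi-paragraph version: the Promise only settles on the end event. As far as I know, an utterance that fails or gets interrupted fires an error event instead, and depending on the browser the loop could then wait forever. Below is a rough sketch of a wrapper that also rejects on error. speakUtterance is a hypothetical helper name, and the synth parameter exists only so the function can be tried with a stand-in object outside a browser.

```javascript
// Hypothetical helper: wraps a single utterance in a Promise.
// `synth` defaults to the browser's speechSynthesis; it is a parameter
// only so a stand-in object can be passed when testing outside a browser.
function speakUtterance(utterance, synth = globalThis.speechSynthesis) {
  return new Promise((resolve, reject) => {
    // Resolve when the utterance has been spoken completely.
    utterance.addEventListener('end', () => resolve(utterance), { once: true });

    // Reject if speaking failed or was interrupted, so callers do not hang.
    utterance.addEventListener(
      'error',
      (event) => reject(new Error(`Speech failed: ${event.error ?? 'unknown'}`)),
      { once: true }
    );

    synth.speak(utterance);
  });
}
```

In the loop you would await speakUtterance(utterance) inside a try/catch and restore the paragraph text in both branches.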