
JS Speech Synthesis to Node.JS Readable Stream


JavaScript's built-in text-to-speech function is SpeechSynthesis.speak(). I'm using discord.js and I want to convert Speech Synthesis to a node.js Readable Stream so I can play it back as a broadcast to a voice channel.

I did find this GitHub repo, but it didn't work for me and I had a hard time broadcasting its output. (Also, while I'm not 100% certain, I assume it records the TTS output, which isn't great when dealing with large strings of text.)

Here are the main errors I got while using the aforementioned code:

The AudioContext was not allowed to start. It must be resumed (or created) after a user gesture on the page.

and

Uncaught TypeError: Cannot read property 'getUserMedia' of undefined

My goal is to avoid something like Google's tts API and just use native JavaScript. Is it at all possible to convert Speech Synthesis to a Readable Stream that I can use in discord.js? If so, how? Or, is there a way to use the previous repo? Please help me out, it would be much appreciated.

(also, I'm aware discord has a built-in tts button for reading messages - this is for something entirely different)


Solution

  • The Web Speech API (speechSynthesis.speak()) uses the underlying OS or browser synthesis implementation and doesn't go through the Web Audio API, so there is no audio data you can capture from it directly. It is also a browser API and isn't available in Node.js. That GitHub repo actually uses your system's microphone to record the speech output, which is not a good idea except as a hack. You'll need to use something else to generate the audio, perhaps Say.js, which is cross-platform and works directly in Node.js?
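
As a rough sketch of that suggestion (not a tested drop-in), the snippet below asks a Say.js-style object to export the speech to a temporary WAV file and then opens that file as a Node.js Readable stream. The helper name speechToStream and the temp-file naming are made up for illustration. It assumes the `say` package's export(text, voice, speed, filename, callback) method, which as far as I know is only supported on macOS and Windows, and the commented usage assumes a discord.js v12 voice connection.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: renders `text` to a temporary WAV file via a
// Say.js-style object and resolves with a Readable stream of that file.
// `tts` is expected to expose export(text, voice, speed, filename, callback).
function speechToStream(tts, text, voice = null, speed = 1.0) {
  const unique = `${Date.now()}-${Math.random().toString(36).slice(2)}`;
  const file = path.join(os.tmpdir(), `tts-${unique}.wav`);

  return new Promise((resolve, reject) => {
    tts.export(text, voice, speed, file, (err) => {
      if (err) return reject(err);
      const stream = fs.createReadStream(file);
      // Delete the temporary file once the stream has been closed.
      stream.once('close', () => fs.unlink(file, () => {}));
      resolve(stream);
    });
  });
}

// Usage (assumes `npm install say` and a discord.js v12 VoiceConnection
// named `connection`; newer discord.js versions use @discordjs/voice instead):
//
//   const say = require('say');
//   speechToStream(say, 'Hello from the bot')
//     .then((stream) => connection.play(stream))
//     .catch(console.error);
```

Note that this still writes the whole utterance to disk before playback starts, so very long strings will have some delay; splitting the text into sentences and queuing them may help.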