node.js speech-recognition speech-to-text

Speech recognition in Node.js


I'm currently working on a tool that lets me read all my notifications by connecting to different APIs.

It's working great, but now I would like to add voice commands to trigger some actions.

For example, when the software says "One mail from Bob", I would like to be able to say "Read it" or "Archive it".

My software runs on a Node server. I currently don't have any browser implementation, but adding one could be an option.

What is the best way to enable speech-to-text in Node.js?

I've seen a lot of threads on this, but they mainly rely on the browser, and I would like to avoid that at the beginning if I can. Is it possible?

Another issue is that some software requires a WAV file as input. I don't have any file; I just want my software to always be listening so it can react when I say a command.

Do you have any information on how I could do that?

Cheers


Solution

  • Both of the existing answers are good, but I think what you're looking for is Sonus. It takes care of audio encoding and streaming for you, and it is always listening offline for a customizable hotword (like Siri or Alexa). You can also trigger listening programmatically. In combination with a module like say, you could implement your example with something like this:

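    // Assumes `say` and `Sonus` are already required, and `sonus` was created with Sonus.init() and started.
    // say.speak takes (text, voice, speed, callback); null keeps the default voice and speed.
    say.speak('One mail from Bob', null, null, function(err) {
      if (err) return console.error(err);
      Sonus.trigger(sonus, 1) // start listening, as if hotword 1 had been spoken
    });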
    

    You can also use different hotwords so that the speech recognized after each one is handled differently. For instance:
    "Notifications. Most recent." and "Send message. How are you today", where "Notifications" and "Send message" are the hotwords.

    Throw that onto a Pi or a CHIP with a microphone on your desk and you have a personal assistant that reads your notifications and reacts to commands.

    Simple Example:
    https://twitter.com/_evnc/status/811290460174041090

    Something a bit more complex:
    https://youtu.be/pm0F_WNoe9k?t=20s

    Full documentation:
    https://github.com/evancohen/sonus/blob/master/docs/API.md

    Disclaimer: This is my project :)
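
Once a transcript comes back, you still need to map it to an action such as "Read it" or "Archive it". Here is a minimal sketch of that step. It assumes your recognizer hands you the final transcript as a plain string. `normalize`, `createCommandRouter`, and the handlers are hypothetical names made up for illustration; they are not part of Sonus or say.

```javascript
// Hypothetical command router: maps a recognized transcript to an action.
// None of these names come from Sonus or say; they are illustrative only.

// Normalize a transcript so that "Read it." and "read it" compare equal.
// Assumption: commands are plain English, so only a-z, 0-9 and spaces are kept.
function normalize(transcript) {
  return transcript
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, '')
    .replace(/\s+/g, ' ')
    .trim();
}

// Build a router from a table of { phrase: handler } entries.
function createCommandRouter(commands) {
  const table = new Map();
  for (const [phrase, handler] of Object.entries(commands)) {
    table.set(normalize(phrase), handler);
  }
  // `context` is whatever the command should act on, e.g. the last notification.
  return function route(transcript, context) {
    const handler = table.get(normalize(transcript));
    if (!handler) return { handled: false, result: null };
    return { handled: true, result: handler(context) };
  };
}

// Example: the notification that was just read aloud is the context.
const route = createCommandRouter({
  'read it': (mail) => `Reading mail from ${mail.from}`,
  'archive it': (mail) => `Archived mail from ${mail.from}`,
});

const lastNotification = { from: 'Bob' };
console.log(route('Read it.', lastNotification));
console.log(route('Play some music', lastNotification));
```

Exact matching keeps the sketch short. In practice you would probably want something more forgiving, since a recognizer will not always return the same wording for the same spoken command.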