node.js speech-recognition speech-to-text

Speech recognition in Node.js

I'm currently working on a tool that reads out all my notifications by connecting to different APIs.

It's working great, but now I would like to add voice commands to trigger actions.

For example, when the software says "One mail from Bob", I would like to be able to say "Read it" or "Archive it".

My software runs on a Node server. I currently don't have any browser implementation, though that could be an option later.

What is the best way to enable speech-to-text in Node.js?

I've seen a lot of threads on this, but most of them rely on the browser, and I would like to avoid that at the beginning if possible. Is it possible?

Another issue is that some software requires a WAV file as input. I don't have any file; I just want my software to listen continuously so it can react when I say a command.

Do you have any information on how I could do that?

Cheers

Solution

Both of the existing answers are good, but I think what you're looking for is Sonus. It takes care of audio encoding and streaming for you, and it is always listening offline for a customizable hotword (like Siri or Alexa). You can also trigger listening programmatically. In combination with a module like say, you could implement your example by doing something like:

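// Assumes `say` and `Sonus` are already required and `sonus` was created with Sonus.init(...)
// say.speak(text, voice, speed, callback); null uses the system defaults
say.speak('One mail from Bob', null, null, function (err) {
  if (err) return console.error(err)
  Sonus.trigger(sonus, 1) // start listening without waiting for the hotword
})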
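Once listening has been triggered, the recognized text still has to be mapped to an action such as "Read it" or "Archive it". Here is a minimal sketch of that step. The command table and the `matchCommand` and `handleAction` names are hypothetical, and the wiring at the bottom assumes the `final-result` event shown in the Sonus README; it is left commented out so the snippet runs without a microphone.

```javascript
// Hypothetical command table: normalized spoken phrase -> action name.
const COMMANDS = new Map([
  ['read it', 'read'],
  ['archive it', 'archive'],
]);

// Normalize a transcript (lowercase, strip punctuation, collapse spaces)
// and look it up in the command table. Returns null when nothing matches.
function matchCommand(transcript) {
  const normalized = String(transcript)
    .toLowerCase()
    .replace(/[^a-z0-9\s]/g, '')
    .replace(/\s+/g, ' ')
    .trim();
  return COMMANDS.has(normalized) ? COMMANDS.get(normalized) : null;
}

// Assumed wiring, not executed here:
// sonus.on('final-result', (text) => {
//   const action = matchCommand(text);
//   if (action) handleAction(action); // handleAction is a hypothetical helper
// });
```

Exact matching is the simplest choice; recognizers often return slightly different wording, so you may want a looser comparison in practice.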

You can also use different hotwords to handle the subsequent recognized speech in different ways. For instance (hotword first, then the command):
"Notifications. Most recent." and "Send message. How are you today"
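One way to route speech by hotword is to remember which hotword fired and pass the next transcript to a matching handler. This is only a sketch: `createHotwordRouter` and the two handlers are hypothetical names, and the commented wiring assumes the `hotword` and `final-result` events from the Sonus README.

```javascript
// Build a router that remembers the last hotword heard and sends the
// following transcript to that hotword's handler.
function createHotwordRouter(handlers) {
  const table = new Map(Object.entries(handlers));
  let activeHotword = null;
  return {
    // Call when a hotword is detected.
    onHotword(keyword) {
      activeHotword = keyword;
    },
    // Call with the final transcript. Returns the handler's result, or
    // null if no hotword is active or no handler is registered for it.
    onFinalResult(text) {
      const handler = table.get(activeHotword);
      activeHotword = null; // one utterance per hotword
      return typeof handler === 'function' ? handler(text) : null;
    },
  };
}

// Hypothetical handlers for the two example hotwords.
const router = createHotwordRouter({
  notifications: (text) => ({ type: 'notifications', query: text }),
  'send message': (text) => ({ type: 'message', body: text }),
});

// Assumed wiring, not executed here:
// sonus.on('hotword', (index, keyword) => router.onHotword(keyword));
// sonus.on('final-result', (text) => router.onFinalResult(text));
```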

Throw that onto a Pi or a CHIP with a microphone on your desk and you have a personal assistant that reads your notifications and reacts to commands.

Simple Example:
https://twitter.com/_evnc/status/811290460174041090

Something a bit more complex:
https://youtu.be/pm0F_WNoe9k?t=20s

Full documentation:
https://github.com/evancohen/sonus/blob/master/docs/API.md

Disclaimer: This is my project :)