c# .net speech-recognition sapi speech-to-text

System.Speech.Recognition Choosing Recognition Profile


Does anyone know how to change recognition profiles from within a .NET application?

I am writing a .NET application that does speech recognition using the capabilities found in the System.Speech.Recognition namespace. The audio that I am feeding into the system comes from multiple different users. I would like to be able to train the system to more accurately recognize speech from each of the different users.

I have found the Speech Recognition control panel in Windows (Windows 7 in this case), where I can configure training profiles. Setting up a profile for myself and going through the training process significantly improved the recognition accuracy. So I could set up a profile for every user and have each of them do the training, but then I need to be able to select the right profile in my application.

My application is a "server" that receives audio streams from one or more users at a time and performs the speech recognition. So I need to specify programmatically which recognition profile to use for each instance of the recognition engine that my application creates. This is not a single-user application, so I can't just have users select their profile from the Windows control panel.


Solution

  • I don't see a way to do it via System.Speech.Recognition, but you can do it via SpeechLib (the SAPI IDispatch-compatible automation API). Look at ISpeechRecognizer::Profile.

    To set the profile, you will need to reference the Microsoft Speech Object Library (the SpeechLib COM interop) in your project and add

    using SpeechLib;
    

    to your code, along with System.Speech.Recognition.

    The tricky part would be getting the profile that you set via SpeechLib to 'stick' while you're creating the System.Speech.Recognition.SpeechRecognitionEngine. I'd probably set the profile as the default (via SpeechLib), create the SpeechRecognitionEngine, and then restore the previous default profile.

    (I'm assuming that you're not planning to use the shared recognizer, which won't work in a multiuser scenario.)

    You'll probably need a critical section (a lock, in C#) to make sure that only one thread can swap the default profile and create the SpeechRecognitionEngine at a time.
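
    A rough, untested sketch of that idea follows. `ProfileScopedEngineFactory` and `Create` are made-up names for illustration, and the registry path is my assumption of where SAPI keeps its recognition profile category. It also assumes that the engine picks up the default profile at construction time, which I have not verified, so treat it as a starting point rather than a confirmed solution.

    ```csharp
    using System;
    using System.Speech.Recognition;
    using SpeechLib; // COM interop for the Microsoft Speech Object Library

    // Hypothetical helper (not part of either API): creates an in-process engine
    // while a chosen recognition profile is temporarily the default.
    static class ProfileScopedEngineFactory
    {
        // Serializes engine creation so two threads never swap the default at once.
        private static readonly object CreationLock = new object();

        // Assumption: the token category under which SAPI stores recognition profiles.
        private const string RecoProfilesCategory =
            @"HKEY_CURRENT_USER\SOFTWARE\Microsoft\Speech\RecoProfiles";

        public static SpeechRecognitionEngine Create(string profileTokenId)
        {
            if (profileTokenId == null)
                throw new ArgumentNullException("profileTokenId");

            lock (CreationLock)
            {
                var category = new SpObjectTokenCategory();
                category.SetId(RecoProfilesCategory, false);

                // Remember the current default so it can be restored afterwards.
                string previousDefault = category.Default;
                try
                {
                    category.Default = profileTokenId;

                    // Assumption: the engine reads the default profile during construction.
                    return new SpeechRecognitionEngine();
                }
                finally
                {
                    // Restore the previous default even if construction throws.
                    category.Default = previousDefault;
                }
            }
        }
    }
    ```

    The token ID to pass in should be the Id of one of the profile tokens returned by ISpeechRecognizer::GetProfiles. If the engine turns out to load the profile lazily (for example, only when recognition starts), the lock would have to cover that step as well.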