Tags: c#, speech-recognition, grammar, microsoft-speech-platform

Too many grammars in Microsoft Speech SDK 11


I am writing a simple speech recognition app that loads grammars into the engine.

However, I cannot load more than 1024 grammars into the engine. When I try, I get this error:

Additional information: Too many grammars have been loaded. Number of grammars cannot exceed 1024.

And when I load exactly 1024 grammars, the engine recognizes nothing, neither the input .wav file nor my live speech:

    Thread.CurrentThread.CurrentCulture = new CultureInfo("ru-RU");
    Thread.CurrentThread.CurrentUICulture = new CultureInfo("ru-RU");

    // Create a new SpeechRecognitionEngine instance.
    _sre = new SpeechRecognitionEngine(new CultureInfo("ru-RU"));

    _sre.SpeechHypothesized += _sre_SpeechHypothesized;
    _sre.SpeechDetected += _sre_SpeechDetected;
    _sre.SetInputToWaveFile(@"c:\Test\Wavs\Wavs-converted\file.wav");


    public void LoadGrammarIntoEngine(IEnumerable<String> textColl)
    {
        // Nothing to load without an engine or a word list.
        if (_sre == null || textColl == null)
            return;

        // The engine rejects any grammar beyond its 1024-grammar limit.
        if (_sre.Grammars.Count >= 1024)
        {
            Console.WriteLine("too many grammars");
            return;
        }

        // Build one grammar that matches any single word from the list.
        Choices choices = new Choices();
        choices.Add(textColl.ToArray());

        GrammarBuilder gb = new GrammarBuilder();
        gb.Culture = new CultureInfo("ru-RU");
        gb.Append(choices);

        _sre.LoadGrammar(new Grammar(gb));
    }

P.S. When I load 5-10 grammars (100 words each), it works well.

Could I, or should I, use more than one recognition engine at the same time?


Solution

  • Judging from the comments, you're fundamentally taking the wrong approach. You need to be using something like System.Speech.Recognition.DictationGrammar, which uses the Microsoft desktop SR engine.

    This will accept most English words. If you need to limit it to 1000 words, there are a couple of options.

    If your word list contains words that aren't in the default word list (which is quite extensive), you can use the Lexicon Interfaces, which, sadly, aren't exposed through System.Speech.Recognition, so you'll have to drop to SAPI to use them.

    This also assumes that you can reject out-of-vocabulary recognitions. If that's not the case, you can use the Dictation Resource Kit, which will let you build a custom language model; be warned that it's built by speech scientists for speech scientists, so the documentation is pretty hard going.

    In practice, users will say out-of-vocabulary things; it's best to check for them and reject them. Small (and yes, 1000 words is small) vocabularies tend to have problems with false positives (user says something out-of-vocabulary that gets recognized as something in-vocabulary). This happens with command-and-control grammars, as well.
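
The out-of-vocabulary check described above could look roughly like the sketch below. It is only an illustration: `VocabularyFilter`, `BuildVocabulary`, and `IsInVocabulary` are hypothetical names, not part of System.Speech, and the sketch assumes that case-insensitive whole-word matching is good enough for your word list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper (not part of System.Speech): decides whether a
// recognized phrase stays inside a fixed word list.
public static class VocabularyFilter
{
    // Builds a case-insensitive lookup set from the allowed words.
    public static HashSet<string> BuildVocabulary(IEnumerable<string> words)
    {
        return new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);
    }

    // Returns true only if every recognized word is in the vocabulary.
    // An empty phrase is rejected, since nothing was recognized.
    public static bool IsInVocabulary(IEnumerable<string> recognizedWords,
                                      HashSet<string> vocabulary)
    {
        List<string> words = recognizedWords.ToList();
        return words.Count > 0 && words.All(w => vocabulary.Contains(w));
    }
}
```

In a SpeechRecognized handler, the words to check could come from `e.Result.Words.Select(w => w.Text)`. Results that fail the check would simply be discarded. A confidence threshold on `e.Result.Confidence` is a separate filter you may want as well.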