Tags: speech-recognition, cmusphinx, sphinx4

Which language model to use for dictation


I intend to use sphinx4 in dictation mode, but I have a question about the language model. My application will have a very large vocabulary, i.e., it may use any English word, and I do not know in advance which phrases will be spoken. Which language model should I use? Does sphinx4 have a specific language model for such cases?


Solution

  • You should use the US English Generic Language Model, available in the CMUSphinx downloads:

    https://sourceforge.net/projects/cmusphinx/files/Acoustic%20and%20Language%20Models/US%20English%20Generic%20Language%20Model/
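
    Once the model is downloaded, pointing sphinx4 at it might look roughly like the sketch below. It assumes the sphinx4-core and sphinx4-data jars are on the classpath (the acoustic model and dictionary paths are the ones bundled in sphinx4-data), and that the language model location is passed as the first command-line argument. The class name and the example path are hypothetical; substitute the file you actually downloaded.

    ```java
    import edu.cmu.sphinx.api.Configuration;
    import edu.cmu.sphinx.api.LiveSpeechRecognizer;
    import edu.cmu.sphinx.api.SpeechResult;

    public class DictationSketch {
        public static void main(String[] args) throws Exception {
            // Assumption: args[0] is the location of the downloaded generic
            // language model, e.g. "file:/path/to/your-model.lm" (hypothetical path).
            String languageModelPath = args[0];

            Configuration configuration = new Configuration();
            // Acoustic model and dictionary bundled with the sphinx4-data artifact.
            configuration.setAcousticModelPath("resource:/edu/cmu/sphinx/models/en-us/en-us");
            configuration.setDictionaryPath("resource:/edu/cmu/sphinx/models/en-us/cmudict-en-us.dict");
            // Large-vocabulary statistical language model instead of a grammar.
            configuration.setLanguageModelPath(languageModelPath);

            // Recognize speech from the default microphone.
            LiveSpeechRecognizer recognizer = new LiveSpeechRecognizer(configuration);
            recognizer.startRecognition(true); // true discards previously cached audio

            SpeechResult result;
            while ((result = recognizer.getResult()) != null) {
                // Print the best hypothesis for each recognized utterance.
                System.out.println(result.getHypothesis());
            }
            recognizer.stopRecognition();
        }
    }
    ```

    Every word in the language model also has to be present in the dictionary, so if you swap in a different model you may need a matching dictionary as well.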