Tags: voice-recognition, cmusphinx, language-model

How to use an ARPA file in voice recognition


I have created an ARPA file from a text file using the CMU SLM toolkit.

I don't know how to use the generated ARPA file in my project in place of the .lm and .dic files.

If anyone knows how, please let me know.


Solution

  • You use the probability from the language model when computing the "cost" of a word transition in the search. :-) But that's probably not what you wanted to hear.

    Your question is too open-ended. What is your specific problem?

    The dictionary and the language model are two separate items; you cannot convert one into the other. An ARPA file is a language model, so it can only stand in for the .lm file, never for the .dic file.

    The dictionary is used to tell the search what the valid words are and how they relate to phonemes / the phonetic transcription.
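To make the dictionary's role concrete, here is a minimal sketch of reading one into a word-to-phonemes lookup. It assumes the usual CMUSphinx .dic layout (one entry per line, the word first, then its phonemes, with a suffix such as "(2)" marking an alternate pronunciation). The sample entries and the name parse_dictionary are made up for illustration and are not part of any toolkit.

```python
import re

# Hypothetical sample in the assumed .dic layout: word, then phonemes.
# A "(2)" suffix marks an alternate pronunciation of the same word.
SAMPLE_DIC = """\
hello HH AH L OW
hello(2) HH EH L OW
world W ER L D
"""

def parse_dictionary(text):
    """Map each word to a list of pronunciations (each a list of phonemes)."""
    lexicon = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        # Strip an alternate-pronunciation suffix such as "(2)".
        word = re.sub(r"\(\d+\)$", "", parts[0])
        lexicon.setdefault(word, []).append(parts[1:])
    return lexicon

lexicon = parse_dictionary(SAMPLE_DIC)
print(lexicon["hello"])      # both pronunciations of "hello"
print("goodbye" in lexicon)  # a word missing from the dictionary is not valid
```

A word that is absent from this lookup cannot be recognized, no matter what the language model says about it.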

    The language model is used while recognizing an utterance: whenever the search algorithm considers a word transition, it looks up the probability of the corresponding unigram, bigram, or higher-order n-gram.
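The word-transition lookup described above can be sketched roughly as follows. This is a toy bigram model with invented log10 probabilities and backoff weights, stored the way ARPA-style models keep them. The function names are hypothetical, and a real decoder such as Sphinx combines this score with acoustic scores and other weights that are not shown here.

```python
# Toy bigram model with made-up log10 probabilities and backoff weights.
UNIGRAMS = {          # word: (log10 P(word), log10 backoff weight)
    "<s>":   (-99.0, -0.30),
    "hello": (-0.70, -0.20),
    "world": (-0.90, -0.10),
    "</s>":  (-0.50,  0.00),
}
BIGRAMS = {           # (previous, word): log10 P(word | previous)
    ("<s>", "hello"): -0.20,
    ("hello", "world"): -0.15,
    ("world", "</s>"): -0.10,
}

def transition_logprob(previous, word):
    """log10 P(word | previous), backing off to the unigram when needed."""
    if (previous, word) in BIGRAMS:
        return BIGRAMS[(previous, word)]
    # No bigram entry: backoff weight of the history plus the unigram score.
    backoff = UNIGRAMS[previous][1]
    return backoff + UNIGRAMS[word][0]

def sentence_logprob(words):
    """Sum the transition scores over <s> w1 ... wn </s>."""
    padded = ["<s>"] + words + ["</s>"]
    return sum(transition_logprob(a, b) for a, b in zip(padded, padded[1:]))

# The word order seen in the bigram table scores higher than the reverse.
print(sentence_logprob(["hello", "world"]) > sentence_logprob(["world", "hello"]))
```

Working in log probabilities turns the product of transition probabilities into a sum, which is why the search can treat each transition as an additive cost.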

    Edit: check these links:

    http://www-speech.sri.com/projects/srilm/manpages/ngram-format.5.html

    http://www.ee.ucla.edu/~weichu/htkbook/node243_ct.html

    http://www.ling.ohio-state.edu/~bromberg/ngramcount/ngram2fsm.html
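For orientation, here is a minimal sketch of reading the ARPA n-gram layout that the first link documents: a \data\ header, one \N-grams: section per order, and lines of the form "log10 probability, words, optional log10 backoff weight". The sample model and the name parse_arpa are invented for illustration, and the parser assumes a well-formed file with whitespace-separated fields. It is not how Sphinx itself loads models.

```python
import re

# Hypothetical tiny model in the assumed ARPA layout.
SAMPLE_ARPA = r"""\data\
ngram 1=4
ngram 2=3

\1-grams:
-99.0 <s> -0.30
-0.70 hello -0.20
-0.90 world -0.10
-0.50 </s>

\2-grams:
-0.20 <s> hello
-0.15 hello world
-0.10 world </s>

\end\
"""

def parse_arpa(text):
    """Return {order: {ngram_tuple: (log10_prob, log10_backoff_or_None)}}."""
    model = {}
    order = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line == "\\data\\" or line.startswith("ngram "):
            continue  # skip blanks, the header marker, and the count lines
        if line == "\\end\\":
            break
        header = re.fullmatch(r"\\(\d+)-grams:", line)
        if header:
            order = int(header.group(1))
            model[order] = {}
            continue
        if order is None:
            continue  # ignore anything before the first section
        fields = line.split()
        logprob = float(fields[0])
        words = tuple(fields[1:1 + order])
        # A trailing field after the words, if present, is the backoff weight.
        backoff = float(fields[1 + order]) if len(fields) > 1 + order else None
        model[order][words] = (logprob, backoff)
    return model

model = parse_arpa(SAMPLE_ARPA)
print(model[2][("hello", "world")])
```

Note that nothing in this file says how a word is pronounced, which is why a separate dictionary is still required.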