Sphinxtrain senone.c error and pocketsphinx_continuous bin_mdef.c error


I'm building a Sinhala speech recognition system using pocketsphinx, and I have run into two major errors: one when running the sphinxtrain run command and one when running the pocketsphinx_continuous command. My project folder can be seen HERE. I'm still using a small data set and am currently recording more words. After running sphinxtrain run, I copied the following files to the default pocketsphinx model location, /usr/local/share/pocketsphinx/model/en-us/, into a new folder called si:

  • mdef
  • feat.params
  • mixture_weights
  • means
  • noisedict
  • transition_matrices
  • variances
  • sinhala.dic
  • sinhala.lm
  • sinhala.phone

Then I ran the pocketsphinx_continuous command; the errors I got are HERE.

  1. For the Sinhala language it is very difficult to reduce the number of phones, especially below 255. Is there any solution for that?
  2. Why am I getting the senone.c error mentioned in the logs, and how can I correct it?
  3. Does SRILM support creating .lm.bin files for the Sinhala language?

Solution

  • My project folder can be seen HERE.

    It is better to share files through a more user-friendly service such as Google Drive or Dropbox. It is not polite to ask people to use websites with spam and adware.

    For the Sinhala language it is very difficult to reduce the number of phones, especially below 255. Is there any solution for that?

    Use a smaller phoneset. According to the paper

    http://www.panl10n.net/english/final%20reports/pdf%20files/Sri%20Lanka/SRI04.pdf

    you can use just 40 phonemes.

    Why am I getting the senone.c error mentioned in the logs, and how can I correct it?

    You are using too many phonemes; use a smaller phoneset.
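Before retraining, it can help to count how many distinct phones the dictionary actually uses. The following is a minimal sketch, not part of sphinxtrain. It assumes sinhala.dic follows the usual CMU-style layout (one word per line, followed by its phones separated by whitespace) and it takes the 255 figure from the question rather than from the pocketsphinx source. The function names are hypothetical.

```python
# Limit quoted in the question above; an assumption, not read from pocketsphinx itself.
MAX_PHONES = 255


def phones_in_dictionary(lines):
    """Collect the distinct phones used in CMU-style dictionary lines.

    Assumed line format: WORD PHONE1 PHONE2 ...
    Blank lines and ';;;' comment lines are skipped.
    """
    phones = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith(";;;"):
            continue
        # The first token is the word; everything after it is the pronunciation.
        _word, *pronunciation = line.split()
        phones.update(pronunciation)
    return phones


def check_phoneset(lines, limit=MAX_PHONES):
    """Return (phone_count, is_below_limit) for the given dictionary lines."""
    count = len(phones_in_dictionary(lines))
    return count, count < limit


if __name__ == "__main__":
    # Tiny made-up sample; replace with open("sinhala.dic", encoding="utf-8").
    sample = ["amma a m m a", "gasa g a s a", "pota p o t a"]
    count, ok = check_phoneset(sample)
    print(f"{count} distinct phones, below limit: {ok}")
```

Filler phones such as SIL normally appear only in the .phone file and noisedict, so the real total may be slightly higher than what the dictionary alone shows.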

    Does SRILM support creating .lm.bin files for the Sinhala language?

    No, but you can use an LM created with SRILM directly, without converting it to .lm.bin.
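As an illustration of using the text LM directly, the sketch below assembles a pocketsphinx_continuous command line that points -lm at sinhala.lm. It only builds and prints the command and does not run it. The helper name build_command is hypothetical, and the model directory is assumed to be the si folder described in the question.

```python
import shlex


def build_command(model_dir, lm_name="sinhala.lm", dict_name="sinhala.dic"):
    """Build an argument list for pocketsphinx_continuous (hypothetical helper).

    Assumes the acoustic model files, the dictionary, and the language model
    all live in model_dir, as in the question.
    """
    return [
        "pocketsphinx_continuous",
        "-hmm", model_dir,                   # acoustic model directory
        "-lm", f"{model_dir}/{lm_name}",     # text LM, no .lm.bin conversion
        "-dict", f"{model_dir}/{dict_name}",  # pronunciation dictionary
        "-inmic", "yes",                     # read audio from the microphone
    ]


if __name__ == "__main__":
    # Path taken from the question; adjust to your own installation.
    cmd = build_command("/usr/local/share/pocketsphinx/model/en-us/si")
    print(shlex.join(cmd))
```

Passing the printed list to subprocess.run would start the recognizer, provided pocketsphinx_continuous is on the PATH.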