speech-recognition speech-to-text cmusphinx pocketsphinx

Sphinxtrain returns different results than pocketsphinx

I made it, finally. My WER (word error rate) is at 0% after training. I have only a small dataset for simple voice recognition (just the words "yes" and "no" in another language). I trained with sphinxtrain (126 training files, 12 test files). Each audio file is about 5 s long and contains 8 words (a mix of yes/no).

After training, I decided to take my test files and run them through pocketsphinx. Nearly every file I tested had at least one word error. Sometimes it recognized one or two more words than expected. Sometimes it recognized a "yes" as a "no".
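To put a number on those errors, here is a minimal sketch of how a word error rate can be computed for one file from its reference transcript and the recognizer output. It is a plain word-level edit distance, not the scoring code that sphinxtrain itself uses, and the function name `word_error_rate` and the English "yes"/"no" transcripts are placeholders chosen for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the minimum number of edits needed to turn the
    # reference words processed so far into the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]  # deleting all i reference words yields an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = cur[j - 1] + 1   # extra word in hypothesis
            cur.append(min(substitution, deletion, insertion))
        prev = cur

    return prev[-1] / len(ref)


if __name__ == "__main__":
    reference = "yes no yes yes no no yes no"          # 8 words, as in one test file
    extra_word = "yes no no yes yes no no yes no"      # one "no" recognized twice
    swapped_word = "yes no yes no no no yes no"        # one "yes" recognized as "no"
    print(word_error_rate(reference, extra_word))
    print(word_error_rate(reference, swapped_word))
```

With 8 words per file, a single inserted or substituted word already gives that file a WER of 1/8, so even one error per file is far from the 0% reported after training.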

I'd like to know why I'm getting different results from sphinxtrain and pocketsphinx.
I'd also like to know how I can improve my results using pocketsphinx (especially the problem that pocketsphinx recognizes one "no" as two "no"s).

Solution

I'd like to know why I'm getting different results from sphinxtrain and pocketsphinx.

You do not have enough training data.

I'd also like to know how I can improve my results using pocketsphinx (especially the problem that pocketsphinx recognizes one "no" as two "no"s).

Use more training data.