statistics signal-processing speech-recognition hidden-markov-models

Hidden Markov Models - Identifying Phonemes

I'm developing a project that identifies phonemes in order to determine whether someone is saying "Yes" or "No".

So far, I have used zero-crossings to identify what the person is saying. This works really well and is simple enough to understand. However, the project needs a few enhancements and has to be developed using a Hidden Markov Model.

My question is this:

I want to develop a Hidden Markov Model without discarding the work that I have already completed. That is, I currently strip out the data that does not warrant consideration by counting the number of zero-crossings as well as the summation of the blocks.

I do not understand what data I would need to train the HMM so that it can identify these phonemes. For example:

With zero-crossings I have identified that:

Yes - Zero-crossings start low and then the value increases.

No - Zero-crossings start low and then do not increase in value.

Could I train my HMM algorithm so that it interprets these values?

Or could anyone suggest a method by which I can train the HMM to identify the word that is spoken in the input sample?

Hope someone can help :)!

Solution

Could I train my HMM algorithm so that it interprets these values?

Yes, definitely.
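
To make that concrete, here is a minimal sketch of how two tiny HMMs could score the zero-crossing pattern described in the question. Everything in it is an assumption for illustration: the zero-crossing rate is quantized into two symbols (0 = low, 1 = high), each word gets a two-state left-to-right model, and the probabilities are set by hand. In a real system the parameters would be estimated from recordings (e.g. with Baum-Welch) by the training toolkit rather than typed in.

```python
import numpy as np

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: returns log P(obs | model) for a discrete HMM."""
    alpha = start * emit[:, obs[0]]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transitions, then weight by the emission.
            alpha = (alpha @ trans) * emit[:, obs[t]]
        scale = alpha.sum()
        log_lik += np.log(scale)
        alpha = alpha / scale  # rescale to avoid underflow on long sequences
    return log_lik

# Hand-set, hypothetical parameters (not trained on any data).
start = np.array([1.0, 0.0])              # always begin in the first state
trans = np.array([[0.7, 0.3],
                  [0.0, 1.0]])            # left-to-right: no going back

# "Yes": first state emits low ZCR, second state emits high ZCR.
emit_yes = np.array([[0.9, 0.1],
                     [0.1, 0.9]])
# "No": both states emit mostly low ZCR.
emit_no = np.array([[0.9, 0.1],
                    [0.9, 0.1]])

def classify(obs):
    """Pick the word whose model gives the higher likelihood."""
    obs = np.asarray(obs)
    score_yes = forward_log_likelihood(obs, start, trans, emit_yes)
    score_no = forward_log_likelihood(obs, start, trans, emit_no)
    return "yes" if score_yes > score_no else "no"

rising = [0, 0, 0, 1, 1, 1]   # ZCR starts low, then increases
flat = [0, 0, 0, 0, 0, 0]     # ZCR stays low
print(classify(rising), classify(flat))
```

The point of the sketch is only that an HMM can represent "starts low, then rises" versus "stays low" as a sequence of states; a toolkit-trained model on richer features should be far more robust than this toy.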

Or could anyone suggest a method by which I can train the HMM to identify the word that is spoken in the input sample?

You just need to add the zero-crossing rate to the feature file alongside the MFCC features, for example as a 14th feature, and then use any standard HMM toolkit such as CMUSphinx or HTK to train the HMM and decode with it. For more information, see:

http://cmusphinx.sourceforge.net/wiki/mfcformat

http://speech-research.com/htkSearch/index.php?ID=297039

http://speech-research.com/SRTxt2User/index.html
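
As a rough sketch of the feature step described above, the snippet below computes a per-frame zero-crossing rate and appends it as an extra column to an MFCC matrix. The frame length (400 samples) and hop (160 samples) are assumptions corresponding to 25 ms and 10 ms at 16 kHz, and the MFCC matrix is a zero-filled placeholder with 13 columns; in practice the MFCCs would come from the toolkit's own front end, and the combined matrix would be written out in the feature file format that toolkit expects (see the links above).

```python
import numpy as np

def frame_zero_crossing_rate(signal, frame_len=400, hop=160):
    """Fraction of adjacent sample pairs that change sign, per frame."""
    signal = np.asarray(signal, dtype=float)
    n_frames = max(0, 1 + (len(signal) - frame_len) // hop)
    rates = np.empty(n_frames)
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len]
        signs = np.signbit(frame)
        rates[i] = np.count_nonzero(signs[1:] != signs[:-1]) / (frame_len - 1)
    return rates

def append_zcr(mfcc, zcr):
    """Stack the ZCR as one more column; frame counts must match."""
    if mfcc.shape[0] != zcr.shape[0]:
        raise ValueError("MFCC and ZCR must have the same number of frames")
    return np.hstack([mfcc, zcr[:, None]])

# Synthetic test signal (an assumption, not real speech):
# 0.5 s of a 200 Hz tone followed by 0.5 s of a 3000 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate // 2) / sample_rate
signal = np.concatenate([np.sin(2 * np.pi * 200 * t),
                         np.sin(2 * np.pi * 3000 * t)])

zcr = frame_zero_crossing_rate(signal)

# Placeholder for real MFCCs: one row per frame, 13 coefficients.
mfcc = np.zeros((len(zcr), 13))
features = append_zcr(mfcc, zcr)
print(features.shape)
```

The ZCR and the MFCCs must be computed with the same framing so that row i of both refers to the same stretch of audio; that is the main thing to check when combining them.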