I'm developing a project that identifies phonemes in order to tell whether someone is saying "Yes" or "No".
So far in the project, I have used zero-crossings to identify what the person is saying. This works really well and seems simple enough to understand. The project, however, needs a few enhancements and has to be developed using a Hidden Markov Model.
My question is this:
I want to develop a Hidden Markov Model without discarding the work that I have already completed, i.e. the step where I strip out the data that does not warrant consideration by counting the number of zero-crossings as well as the summation of the blocks.
I do not understand what data I would need to train the HMM with in order to identify these phonemes. E.g.
With zero-crossings I have identified that:
Yes - Zero-crossings start low and then the value increases.
No - Zero-crossings start low and the value does not increase.
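To make the pattern above concrete, here is a minimal sketch of how the per-block zero-crossing counts could be computed. It assumes NumPy, a mono signal, and non-overlapping blocks of 256 samples; the function name and the block size are illustrative and not taken from the actual project.

```python
import numpy as np

def zero_crossing_counts(signal, frame_len=256):
    """Count sign changes in each non-overlapping block of a 1-D signal.

    frame_len is an assumed block size; trailing samples that do not
    fill a whole block are dropped.
    """
    signal = np.asarray(signal, dtype=float)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    # True where a sample is negative; a crossing is a change between neighbours.
    negative = np.signbit(frames)
    return np.count_nonzero(negative[:, 1:] != negative[:, :-1], axis=1)
```

For a "Yes"-like sample the returned sequence would be expected to rise towards the end (the "s" sound is noisy and crosses zero often), while for "No" it would stay low throughout.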
Could I train my HMM algorithm so that it interprets these values?
Or could anyone suggest a method by which I can train the HMM to identify the word that is spoken in the sample?
Hope someone can help :)!
Could I train my HMM algorithm so that it interprets these values?
Yes, definitely.
Or could anyone suggest a method by which I can train the HMM to identify the word that is spoken in the sample?
You just need to put the zero-crossing rate in a feature file together with the MFCC features, for example as the 14th feature, and then use any standard HMM training toolkit such as CMUSphinx or HTK to train the HMM and decode with it. For more information see
http://cmusphinx.sourceforge.net/wiki/mfcformat
or
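As a rough illustration of the "14th feature" idea, the sketch below appends one zero-crossing value per frame to an MFCC matrix. It assumes NumPy, 13 MFCC coefficients per frame, and that the zero-crossing values were computed on the same frames as the MFCCs; `append_zcr` is a hypothetical helper, not part of CMUSphinx or HTK, and it does not write the toolkit's feature file format.

```python
import numpy as np

def append_zcr(mfcc, zcr):
    """Return an (n_frames, n_ceps + 1) matrix with ZCR as the last column.

    mfcc: (n_frames, n_ceps) array, e.g. n_ceps = 13 (assumed).
    zcr:  (n_frames,) array with one zero-crossing value per frame.
    """
    mfcc = np.asarray(mfcc, dtype=np.float32)
    zcr = np.asarray(zcr, dtype=np.float32)
    if mfcc.ndim != 2 or zcr.shape != (mfcc.shape[0],):
        raise ValueError("need exactly one ZCR value per MFCC frame")
    return np.hstack([mfcc, zcr[:, None]])

# Placeholder input: 5 frames of 13 zeros stand in for real MFCCs.
features = append_zcr(np.zeros((5, 13)), [3, 4, 9, 20, 31])
```

The resulting 14-dimensional vectors would then have to be saved in whatever feature file format the chosen toolkit expects, and the toolkit configured for 14-dimensional input.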