Tags: python, machine-learning, nltk, hidden-markov-models

Initialize HiddenMarkovModelTrainer object


I'm doing gesture recognition in Python, and one of the more complete libraries I've found that can manage Hidden Markov Models is NLTK. But there is something that I can't understand.

First of all, the data: I have the coordinates of the gesture and I have clustered them into 8 clusters (with k-means). So this is my gesture structure (a sketch of the clustering step follows the data):

raw coordinates x,y: [[123,16], [120,16], [115,16], [111,16], [107,16], [103,17], ...]

centroids x,y : [[ 132.375        56.625     ]
                 [ 122.45454545   30.09090909]
                 [  70.5          27.33333333]
                 ...]

labels: [5 6 6 6 6 6 6 2 2 2 2 2 2 4 4 4 ...]
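
This is roughly how the labels are produced from the raw coordinates - a minimal sketch using scikit-learn's KMeans (the exact library doesn't matter; the points here are just a short dummy gesture for illustration):

import numpy as np
from sklearn.cluster import KMeans

# a short dummy gesture: one (x, y) point per frame
coords = np.array([[123, 16], [120, 16], [115, 16], [111, 16],
                   [107, 16], [103, 17], [99, 18], [95, 20],
                   [90, 23], [86, 27], [83, 32], [81, 38]])

# cluster the points into 8 regions
kmeans = KMeans(n_clusters=8, n_init=10).fit(coords)
centroids = kmeans.cluster_centers_   # 8 centroid coordinates, as above
labels = kmeans.labels_               # one cluster label per point, as above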

Now I want to train an HMM with Baum-Welch on my data (the labels), so HiddenMarkovModelTrainer is my class.

I've found in internet some few more implementations of baum welch, but only in Matlab. the implementation of this algorithms tipically need this input:

baum-welch(X, alphabet, H)

where

  • X is the training data (in my case, the labels)
  • alphabet is the set of possible values of the data (in my case, 0, 1, 2, 3, 4, 5, 6, 7)
  • H is the number of hidden states

Now I am confused because in the nltk.HiddenMarkovModelTrainer constructor I have to give states and symbols, and I don't know what they should be. Considering that the training data X is the input of the HiddenMarkovModelTrainer.train_unsupervised() method, I think that my alphabet corresponds to symbols, but I don't know what to put in states.

I hope my explanation is clear even if my English is poor.


Solution

  • Hidden Markov Models are called so because their actual states are not observable; instead, the states produce an observation with a certain probability. The classical use of HMMs in the NLTK is POS tagging, where the observations are words and the hidden internal states are POS tags. Look at this example to understand what the states and symbols parameters mean in this case.

    For gesture recognition with HMMs, the observations are temporal sequences of some kind of feature modeling (symbols) of the geometrical input data - in your case, you use clustering (also called "zoning" - see section 3.2 of this paper ("Yang, Xu. Hidden Markov Model for Gesture Recognition") for some other possible models). To my understanding, the set of internal states doesn't have any meaningful interpretation. The number of internal states used in the training of an HMM for each gesture is simply a parameter that has to be experimented with (a rough sketch for doing so follows the code below). For an example, see this paper ("Yamato, Ohya, Ishii. Recognizing Human Action in Time-Sequential Images using HMM") - the number of states is set to 36, which is criticized as being too high in this master thesis, just to cite an example of this being a modifiable parameter.

    So I would try it with this code:

    from nltk.tag.hmm import HiddenMarkovModelTrainer

    observed_sequence = [5, 6, 6, 6, 6, 6, 6, 2, 2, 2, 2, 2, 2, 4, 4, 4]
    states = range(20)        # number of hidden states - experiment with this number
    symbols = list(range(8))  # the alphabet: the 8 possible cluster labels
    trainer = HiddenMarkovModelTrainer(states, symbols)
    # train_unsupervised expects each sequence as (symbol, tag) pairs; the tag is ignored
    model = trainer.train_unsupervised([[(sym, None) for sym in observed_sequence]])
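
    To choose the number of states, a rough approach (just a sketch - proper model selection for HMMs is a topic of its own, and max_iterations is only lowered here to keep the runs short) is to train one model per candidate state count and compare how well each one fits the training sequences via its log-probability:

    from nltk.tag.hmm import HiddenMarkovModelTrainer

    observed_sequence = [5, 6, 6, 6, 6, 6, 6, 2, 2, 2, 2, 2, 2, 4, 4, 4]
    training = [[(sym, None) for sym in observed_sequence]]
    symbols = list(range(8))

    for n_states in (5, 10, 20, 36):
        trainer = HiddenMarkovModelTrainer(range(n_states), symbols)
        model = trainer.train_unsupervised(training, max_iterations=100)
        # total log-probability of the training data under the fitted model;
        # more states usually fit the training data better, so also check held-out gestures
        score = sum(model.log_probability(seq) for seq in training)
        print(n_states, score)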