java mallet hidden-markov-models

Mallet HMM Training Problems


I am struggling at the moment with Mallet's ridiculously poor documentation regarding HMMs. I have managed to import my data into instances (adapted from the ImportExample.java snippet), and I would like to know how to use them to train an HMM. I started by creating an HMM instance, but I wasn't sure whether to pass the data alphabet and the target alphabet:

    HMM hmm = new HMM(instances.getDataAlphabet(), instances.getTargetAlphabet());

Or use the same data alphabet twice like so:

    HMM hmm = new HMM(instances.getDataAlphabet(), instances.getDataAlphabet());

Either way, when I get to

    hmm.train(instances);

I get the following error:

cc.mallet.types.FeatureVector cannot be cast to cc.mallet.types.FeatureVectorSequence

I would be grateful for any help you can provide.

Cheers


Solution

  • I have managed to solve this particular problem and thought the solution might be useful to others who run into it. There is a working example in Mallet's examples package: http://hg-iesl.cs.umass.edu/hg/mallet/file/83adf71b0824/src/cc/mallet/examples/TrainHMM.java

    The main problem was how the data was imported through the pipe: as the cast error suggests, the HMM expects each instance to hold a sequence, not a single FeatureVector. From what I can tell, it also helps if your data is in this format:

    TOKEN  TAG 
    TOKEN  TAG
    

    I assume you can put features between the TOKEN and the TAG, but I am not 100% sure. If anyone knows of any good examples or documentation about using HMMs within Mallet, please let me know.
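
To make the expected input shape concrete, here is a minimal sketch in plain Java (no Mallet dependency) that groups TOKEN TAG lines into sequences. It is not Mallet's own code. It assumes a blank line ends one sequence and starts the next, that the first field on a line is the token and the last field is the tag, and that anything in between is an optional feature. The class and method names (TaggedSequenceReader, TaggedToken, parse) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Mallet: it only illustrates how
// "TOKEN [FEATURE ...] TAG" lines can be grouped into one sequence per sentence.
public class TaggedSequenceReader {

    /** One line of input: a token, its tag, and any fields found between them. */
    public static final class TaggedToken {
        public final String token;
        public final List<String> features;
        public final String tag;

        TaggedToken(String token, List<String> features, String tag) {
            this.token = token;
            this.features = features;
            this.tag = tag;
        }
    }

    /**
     * Splits the text into sequences. Assumption: a blank line separates
     * sequences, the first field is the token and the last field is the tag.
     */
    public static List<List<TaggedToken>> parse(String text) {
        List<List<TaggedToken>> sequences = new ArrayList<>();
        List<TaggedToken> current = new ArrayList<>();

        for (String line : text.split("\\r?\\n")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                // A blank line closes the current sequence, if it has any tokens.
                if (!current.isEmpty()) {
                    sequences.add(current);
                    current = new ArrayList<>();
                }
                continue;
            }
            String[] fields = trimmed.split("\\s+");
            if (fields.length < 2) {
                throw new IllegalArgumentException("Expected 'TOKEN TAG', got: " + line);
            }
            // Everything between the first and last field is kept as a feature.
            List<String> features = new ArrayList<>();
            for (int i = 1; i < fields.length - 1; i++) {
                features.add(fields[i]);
            }
            current.add(new TaggedToken(fields[0], features, fields[fields.length - 1]));
        }
        // Keep the last sequence when the input does not end with a blank line.
        if (!current.isEmpty()) {
            sequences.add(current);
        }
        return sequences;
    }

    public static void main(String[] args) {
        String sample = "the DET\ndog NOUN\nbarks VERB\n\nbirds NOUN\nfly VERB\n";
        for (List<TaggedToken> sequence : parse(sample)) {
            StringBuilder sb = new StringBuilder();
            for (TaggedToken t : sequence) {
                sb.append(t.token).append('/').append(t.tag).append(' ');
            }
            System.out.println(sb.toString().trim());
        }
    }
}
```

The point of the sketch is that each training instance ends up as a whole sequence of tokens with a parallel sequence of tags. If the import pipe produces one FeatureVector per instance instead, a cast error like the one in the question is the likely result.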