
HMM training with multiple observations and the mhsmm package in R


I want to train a new HMM using Poisson-distributed observations, which are the only thing I know. I'm using the mhsmm package for R.

The first thing that bugs me is the initialization of the model; in the examples it is:

library(mhsmm)

J <- 3                                  # number of hidden states
initial <- rep(1/J, J)                  # uniform initial state distribution
P <- matrix(1/J, nrow = J, ncol = J)    # uniform transition matrix
b <- list(lambda = c(1, 3, 6))          # Poisson emission rates, one per state
model <- hmmspec(init = initial, trans = P, parms.emission = b,
                 dens.emission = dpois.hsmm)

In my case I don't have initial values for the emission distribution parameters; those are exactly what I want to estimate. How do I do that?

Secondly: if I only have observations, how do I pass them to

h1 = hmmfit(list_of_observations, model, mstep = mstep.pois)

in order to obtain the trained model? In the examples, list_of_observations contains a vector of states, a vector of observations and a vector of sequence lengths, and is usually obtained by simulating the model:

list_of_observations = simulate(model, N, rand.emis = rpois.hsmm)
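For orientation, here is a quick sketch (not in the original post) of what that simulated object contains; the sequence length 100 is an arbitrary choice for illustration:

sim <- simulate(model, 100, rand.emis = rpois.hsmm)   # 100 observations, purely for illustration
str(sim)   # expected: an "hsmm.data" list with $s (states), $x (observations) and $N (sequence lengths)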

EDIT: I found this older question with an answer that partially solved my problem: MHSMM package in R - Input Format? These two lines did the trick:

train <- list(x = data.df$sequences, N = N)
class(train) <- "hsmm.data"

where data.df$sequences is the vector containing all observation sequences concatenated together and N is the vector with the number of observations in each sequence. Still, the initial model is completely arbitrary, but I guess that is the way it is meant to be, since it will be re-estimated. Am I right?
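Putting the pieces together, a minimal sketch with two made-up Poisson sequences (the data and sequence lengths here are hypothetical, and model is the rough specification from the beginning of the question):

library(mhsmm)

seq1 <- c(2, 0, 1, 4, 3, 5)                        # hypothetical first sequence
seq2 <- c(7, 6, 8, 1, 0)                           # hypothetical second sequence

train <- list(x = c(seq1, seq2),                   # all observations, concatenated
              N = c(length(seq1), length(seq2)))   # number of observations per sequence
class(train) <- "hsmm.data"

h1 <- hmmfit(train, model, mstep = mstep.pois)     # re-estimates init, trans and lambda
summary(h1)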


Solution

  • The problem of initialization is critical not only for HMMs and HSMMs, but for all learning methods based on some form of the Expectation-Maximization algorithm. EM converges to a local optimum of the likelihood of the data under the model, but reaching a local optimum does not guarantee reaching the global one. A common workaround, sketched below, is to run EM from several different starting points and keep the fit with the highest final log-likelihood.
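A minimal sketch of that restart strategy (not part of the original answer): it reuses the train object and the number of states J from the question, and the way the starting lambda values are drawn (uniformly scattered around the observed mean) is an arbitrary assumption, not a recipe prescribed by the mhsmm package.

library(mhsmm)

set.seed(42)
J <- 3                                   # number of hidden states, as in the question
n.restarts <- 10
best.fit <- NULL
best.ll <- -Inf

for (i in seq_len(n.restarts)) {
  # Random but data-driven starting values: lambdas scattered around the observed mean
  start.lambda <- sort(runif(J, 0.5, 1.5) * mean(train$x) + 0.1)
  start.model <- hmmspec(init = rep(1/J, J),
                         trans = matrix(1/J, J, J),
                         parms.emission = list(lambda = start.lambda),
                         dens.emission = dpois.hsmm)
  fit <- hmmfit(train, start.model, mstep = mstep.pois)
  ll <- tail(fit$loglik, 1)              # log-likelihood at the last EM iteration
  if (ll > best.ll) {
    best.ll <- ll
    best.fit <- fit
  }
}

summary(best.fit)                        # the run that reached the highest log-likelihood

Quantile-based or k-means-based starting values for lambda would work just as well; the point is only that trying several different initializations makes it less likely to get stuck in a poor local optimum.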