Search code examples
rmachine-learninghidden-markov-models

Learning hidden markov model in R


A hidden Markov model (HMM) is one in which you observe a sequence of observations, but do not know the sequence of states the model went through to generate the observations. Analyses of hidden Markov models seek to recover the sequence of hidden states from the observed data.

I have data with both observations and hidden states (observations are of continuous values) where the hidden states were tagged by an expert. I would like to train a HMM that would be able - based on a (previously unseen) sequence of observations - to recover the corresponding hidden states.

Is there any R package to do that? Studying the existing packages (depmixS4, HMM, seqHMM - for categorical data only) allows you to specify a number of hidden states only.

EDIT:

Example:

data.tagged.by.expert = data.frame(
    hidden.state = c("Wake", "REM", "REM", "NonREM1", "NonREM2", "REM", "REM", "Wake"),
    sensor1 = c(1,1.2,1.2,1.3,4,2,1.78,0.65),
    sensor2 = c(7.2,5.3,5.1,1.2,2.3,7.5,7.8,2.1),
    sensor3 = c(0.01,0.02,0.08,0.8,0.03,0.01,0.15,0.45)
 )

data.newly.measured = data.frame(
    sensor1 = c(2,3,4,5,2,1,2,4,5,8,4,6,1,2,5,3,2,1,4),
    sensor2 =  c(2.1,2.3,2.2,4.2,4.2,2.2,2.2,5.3,2.4,1.0,2.5,2.4,1.2,8.4,5.2,5.5,5.2,4.3,7.8),
    sensor3 = c(0.23,0.25,0.23,0.54,0.36,0.85,0.01,0.52,0.09,0.12,0.85,0.45,0.26,0.08,0.01,0.55,0.67,0.82,0.35)
 )

I would like to create a HMM with discrete time t whrere random variable x(t) represents the hidden state at time t, x(t) {"Wake", "REM", "NonREM1", "NonREM2"}, and 3 continuous random variables sensor1(t), sensor2(t), sensor3(t) representing the observations at time t.

model.hmm = learn.model(data.tagged.by.user)

Then I would like to use the created model to estimate hidden states responsible for newly measured observations

hidden.states = estimate.hidden.states(model.hmm, data.newly.measured)

Solution

  • Data (training/testing)

    To be able to run learning methods for Naive Bayes classifier, we need longer data set

    states = c("NonREM1", "NonREM2", "NonREM3", "REM", "Wake")
    artificial.hypnogram = rep(c(5,4,1,2,3,4,5), times = c(40,150,200,300,50,90,30))
    
    data.tagged.by.expert = data.frame(
        hidden.state = states[artificial.hypnogram],
        sensor1 = log(artificial.hypnogram) + runif(n = length(artificial.hypnogram), min = 0.2, max = 0.5),
        sensor2 = 10*artificial.hypnogram + sample(c(-8:8), size = length(artificial.hypnogram), replace = T),
        sensor3 = sample(1:100, size = length(artificial.hypnogram), replace = T)
    )
    
    hidden.hypnogram = rep(c(5,4,1,2,4,5), times = c(10,10,15,10,10,3))
    data.newly.measured = data.frame(
        sensor1 = log(hidden.hypnogram) + runif(n = length(hidden.hypnogram), min = 0.2, max = 0.5),
        sensor2 = 10*hidden.hypnogram + sample(c(-8:8), size = length(hidden.hypnogram), replace = T),
        sensor3 = sample(1:100, size = length(hidden.hypnogram), replace = T)
    )
    

    Solution

    In the solution, we used Viterbi algorithm - combined with Naive Bayes classifier.

    At each clock time t, a Hidden Markov Model consist of

    • an unobserved state (denoted as hidden.state in this case) taking a finite number of states

      states = c("NonREM1", "NonREM2", "NonREM3", "REM", "Wake")
      
    • a set of observed variables (sensor1, sensor2, sensor3 in this case)

    Transition matrix

    A new state is entered based upon a transition probability distribution (transition matrix). This can be easily computed from data.tagged.by.expert e.g. using

    library(markovchain)
    emit_p <- markovchainFit(data.tagged.by.expert$hidden.state)$estimate
    

    Emission matrix

    After each transition is made, an observation (sensor_i) is produced according to a conditional probability distribution (emission matrix) which depends on the current state H of hidden.state only. We will replace emmision matrices by Naive Bayes classifier.

    library(caret)
    library(klaR)
    library(e1071)
    
    model = train(hidden.state ~ .,
              data = data.tagged.by.expert,
              method = 'nb',
              trControl=trainControl(method='cv',number=10)
              )
    

    Viterbi algorithm

    To solve the problem, we use Viterbi algorithm with the initial probability of 1 for "Wake" state and 0 otherwise. (We expect the patient to be awake in the beginning of the experiment)

    # we expect the patient to be awake in the beginning
    start_p = c(NonREM1 = 0,NonREM2 = 0,NonREM3 = 0, REM = 0, Wake = 1)
    
    # Naive Bayes model
    model_nb = model$finalModel
    
    # the observations
    observations = data.newly.measured 
    nObs <- nrow(observations) # number of observations 
    nStates <- length(states)  # number of states
    
    # T1, T2 initialization
    T1 <- matrix(0, nrow = nStates, ncol = nObs) #define two 2-dimensional tables
    row.names(T1) <- states
    T2 <- T1
    
    Byj <- predict(model_nb, newdata = observations[1,])$posterior
    # init first column of T1
    for(s in states)
      T1[s,1] = start_p[s] * Byj[1,s]
    
    # fill T1 and T2 tables
    for(j in 2:nObs) {
      Byj <- predict(model_nb, newdata = observations[j,])$posterior
      for(s in states) {
        res <- (T1[,j-1] * emit_p[,s]) * Byj[1,s] 
        T2[s,j] <- states[which.max(res)]
        T1[s,j] <- max(res)
      }
    }
    
    # backtract best path
    result <- rep("", times = nObs)
    result[nObs] <- names(which.max(T1[,nObs]))
    for (j in nObs:2) {
      result[j-1] <- T2[result[j], j]
    }
    
    # show the result
    result
    # show the original artificial data 
    states[hidden.hypnogram]
    

    References

    To read more about the problem, see Vomlel Jiří, Kratochvíl Václav : Dynamic Bayesian Networks for the Classification of Sleep Stages , Proceedings of the 11th Workshop on Uncertainty Processing (WUPES’18), p. 205-215 , Eds: Kratochvíl Václav, Vejnarová Jiřina, Workshop on Uncertainty Processing (WUPES’18), (Třeboň, CZ, 2018/06/06) [2018] Download