python, tensorflow, machine-learning, tensorflow2.0, hidden-markov-models

How to get HMM working with real-valued data in Tensorflow


I'm working with a dataset that contains data from IoT devices, and I have found that Hidden Markov Models work pretty well for my use case. As such, I'm trying to adapt some code from a TensorFlow tutorial I found here. Unlike the count data shown in the tutorial, my dataset contains real-valued observations.

In particular, I believe the following code needs to be changed so that the HMM has normally distributed emissions. Unfortunately, I can't find any example of how to give the model an emission distribution other than Poisson.

How should I change the code to emit normally distributed values?

# Define variable to represent the unknown log rates.
trainable_log_rates = tf.Variable(
  np.log(np.mean(observed_counts)) + tf.random.normal([num_states]),
  name='log_rates')

hmm = tfd.HiddenMarkovModel(
  initial_distribution=tfd.Categorical(
      logits=initial_state_logits),
  transition_distribution=tfd.Categorical(probs=transition_probs),
  observation_distribution=tfd.Poisson(log_rate=trainable_log_rates),
  num_steps=len(observed_counts))

rate_prior = tfd.LogNormal(5, 5)

def log_prob():
 return (tf.reduce_sum(rate_prior.log_prob(tf.math.exp(trainable_log_rates))) +
         hmm.log_prob(observed_counts))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)

@tf.function(autograph=False)
def train_op():
  with tf.GradientTape() as tape:
    neg_log_prob = -log_prob()
  grads = tape.gradient(neg_log_prob, [trainable_log_rates])[0]
  optimizer.apply_gradients([(grads, trainable_log_rates)])
  return neg_log_prob, tf.math.exp(trainable_log_rates)

Solution

  • @mCoding's answer is right: in the example posted by TensorFlow, you have a Hidden Markov Model with a uniform initial state distribution (logits [0., 0., 0., 0.]), a heavily diagonal transition matrix (each state tends to persist), and Poisson-distributed emissions.
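
    For reference, here is a minimal sketch of what that setup looks like. The number of states (4) matches the logits above; the 0.9 self-transition probability is only an illustrative assumption, not the tutorial's exact value:

    import numpy as np
    import tensorflow as tf
    import tensorflow_probability as tfp

    tfd = tfp.distributions

    num_states = 4

    # Uniform initial distribution: equal logits for every state.
    initial_state_logits = np.zeros([num_states], dtype=np.float32)

    # "Heavy diagonal" transition matrix: each state keeps itself with high
    # probability and spreads the remainder over the other states.
    stay_prob = 0.9  # illustrative value
    transition_probs = (stay_prob * np.eye(num_states) +
                        (1. - stay_prob) / (num_states - 1) *
                        (1. - np.eye(num_states))).astype(np.float32)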

    To adapt it to your Normal case, you only have to swap that observation distribution for a Normal one. As a starting point, suppose your emission distribution is Normal with parameters:

    training_loc = tf.Variable([0., 0., 0., 0.])
    training_scale = tf.Variable([1., 1., 1., 1.])
    

    then your observation_distribution will be:

    observation_distribution = tfp.distributions.Normal(loc=training_loc, scale=training_scale)
    

    Finally, you also have to change your prior knowledge about these parameters by defining a prior_loc and a prior_scale. You might want to consider uninformative or weakly informative priors, since you are fitting the model afterwards.
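
    One caveat that is not in the original answer: the Normal scale must stay positive, and a raw tf.Variable can drift negative during gradient descent. If that happens, a possible fix (sketched here as an assumption, using tfp.util.TransformedVariable with a Softplus bijector) is to optimize the scale in unconstrained space:

    import tensorflow as tf
    import tensorflow_probability as tfp

    tfb = tfp.bijectors

    # The underlying variable is unconstrained, but reading training_scale
    # always yields a positive value, so tfd.Normal stays well defined.
    training_scale = tfp.util.TransformedVariable(
        tf.ones([4]), bijector=tfb.Softplus(), name='training_scale')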

    So your code should be similar to:

    # Define the emission probabilities.
    training_loc = tf.Variable([0., 0., 0., 0.])    # one location per hidden state
    training_scale = tf.Variable([1., 1., 1., 1.])  # one scale per hidden state
    observation_distribution = tfp.distributions.Normal(loc=training_loc, scale=training_scale)  # change this to your desired distribution
    
    hmm = tfd.HiddenMarkovModel(
      initial_distribution=tfd.Categorical(
          logits=initial_state_logits),
      transition_distribution=tfd.Categorical(probs=transition_probs),
      observation_distribution=observation_distribution,
      num_steps=len(observed_counts))
    
    # Prior distributions
    prior_loc = tfd.Normal(loc=0., scale=1.)
    prior_scale = tfd.HalfNormal(scale=1.)
    
    def log_prob():
      log_probability = hmm.log_prob(observed_counts)  # use your real-valued training data here
      # Add the log prior on the mean and standard deviation of the observation distribution
      log_probability += tf.reduce_sum(prior_loc.log_prob(observation_distribution.loc))
      log_probability += tf.reduce_sum(prior_scale.log_prob(observation_distribution.scale))
      # Return the log probability; its negative is minimized below
      return log_probability
    
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.1)
    
    # Finally, train the model as in the example
    
    losses = tfp.math.minimize(
        lambda: -log_prob(),
        optimizer=optimizer,
        num_steps=100)
    

    Now, if you look at your parameters training_loc and training_scale, they should hold the fitted values.
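
    Once training finishes, here is a short usage sketch (not from the original answer; it assumes your real-valued series is the observed_counts array from the question) for inspecting the fit and decoding the hidden states:

    # Inspect the fitted emission parameters (one entry per hidden state).
    print('fitted locs:', training_loc)
    print('fitted scales:', training_scale)

    # Most likely hidden-state sequence for the observed series (Viterbi decoding).
    most_likely_states = hmm.posterior_mode(observed_counts)

    # Per-timestep posterior probabilities over the hidden states.
    state_marginals = hmm.posterior_marginals(observed_counts).probs_parameter()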