My question is related to the Backward algorithm.
The recursive formula for the algorithm is as follows:
sigma j = 1 to N (t+1(j) * aij * bj(Ot+1))
Where t+1(j) is the recursive element, aij is the transition probability from state i to j, and bj(Ot+1) is the omission probability of the observation O at time t+1.
Given the above, when I start calculating the backward probabilities, it seems that it does not matter what the first observation is, since the observation at time t is not taken into consideration for calculation of the corresponding backward probability.
For Example, for a sequence of observations A,T,G,A, it does not matter what the first observation , i.e, A in this case, is since its emission probability is not taken into consideration in the backward algorithm.
I just want to know if my intuition is correct. Otherwise please point me to some reference or explanation which will clarify my doubt.
Given the above, when I start calculating the backward probabilities, it seems that it does not matter what the first observation is, since the observation at time t is not taken into consideration for calculation of the corresponding backward probability.
That's correct.
b[t](i)
is the probability of observing the sequence O[t+1], O[t+2], ..., O[T]
given the hidden state i
at the time t
.
One way to think about it is this. The hidden state at the time t
independently affects two things: the observation O[t]
, and the probability distribution of the subsequent states and observations. So, once the state is fixed, O[t]
and O[t+1], O[t+2], ..., O[T]
are conditionally independent (given the value of that state). That's why O[t]
doesn't appear in the calculation of b[t](i)
.