I am currently trying to implement a Baum Welch algorithm in C, but I run into the following problem : the gamma function :
gamma(i,t) = alpha(i,t) * beta(i,t) / sum over `i` of(alpha(i,t) * beta(i,t))
Unfortunately, for large enough observation sets, alpha drops rapidly to 0 as t
increases, and beta drops rapidly to 0 as t
decreases, meaning that, due to rounding down, there is never a spot where both alpha and beta are non-zero, which makes things rather problematic.
Is there a way around this problem or should I just try to increase precision for the values? I fear the problem may just pop up again if I try this approach, as alpha and beta drop of about one order of magnitude per observation.
You should do these computations, and generally all computations for probability models, in log-space:
lg_gamma(i, t) = (lg_alpha(i, t) + lg_beta(i, t)
- logsumexp over i of (lg_alpha(i, t) + lg_beta(i, t)))
where lg_gamma(i, t)
represents the logarithm of gamma(i, t)
, etc., and logsumexp
is the function described here. At the end of the computation, you can convert to probabilities using exp
, if needed (that's typically only needed for displaying probabilities, but even there logs may be preferable).
The base of the logarithm is not important, as long as you use the same base everywhere. I prefer the natural logarithm, because log
saves typing compared to log2
:)