Tags: algorithm, machine-learning, mathematical-optimization, n-gram, markov

"Anagram solver" based on statistics rather than a dictionary/table?


My problem is conceptually similar to solving anagrams, except I can't just use a dictionary lookup. I am trying to find plausible words rather than real words.

I have created an N-gram model (for now, N=2) over the letters in a corpus of text. Now, given a random sequence of letters, I would like to permute them into the most likely sequence according to the transition probabilities. I thought I would need the Viterbi algorithm when I started this, but on closer inspection, Viterbi optimizes a sequence of hidden random variables given the observed output, whereas I am trying to optimize the output sequence itself.
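For concreteness, here is a minimal sketch of what such a bigram letter model and scoring function might look like (the names `train_bigrams` and `log_score` are illustrative, not from any library); summing log probabilities avoids numerical underflow from multiplying many small numbers:

```python
from collections import defaultdict
import math

def train_bigrams(corpus):
    # Count letter-to-letter transitions in the training text.
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(corpus, corpus[1:]):
        counts[a][b] += 1
    # Normalize counts into conditional probabilities P(b | a).
    probs = {}
    for a, nexts in counts.items():
        total = sum(nexts.values())
        probs[a] = {b: c / total for b, c in nexts.items()}
    return probs

def log_score(sequence, probs, floor=1e-6):
    # Sum of log transition probabilities; unseen bigrams get a small
    # floor so a single zero doesn't wipe out the whole product.
    return sum(math.log(probs.get(a, {}).get(b, floor))
               for a, b in zip(sequence, sequence[1:]))
```

After `probs = train_bigrams(corpus_text)`, a candidate ordering can be scored with `log_score("hello", probs)`, and "most likely permutation" means the permutation with the highest such score.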

Is there a well-known algorithm for this that I can read about? Or am I on the right track with Viterbi and I'm just not seeing how to apply it?

Update

I have added a bounty to ask for more insight into this problem. (Analysis explaining why an efficient approach isn't possible, other heuristics/approximations besides simulated annealing, etc.)


Solution

  • If I understand your problem correctly, you are searching all permutations of the letters in a word for the one with the highest product of 2-gram probabilities.

    If your word is too long to simply brute-force all permutations, I've found that stochastic optimization algorithms produce good results in a short time. I (having a mathematical background) have done some work with the simulated annealing algorithm, which I think would fit your problem nicely, and it is pretty easy to implement (see the sketch below).
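Here is a minimal sketch of simulated annealing over permutations, assuming a scorer like the `log_score` function sketched in the question (any function mapping a letter sequence to a log-likelihood will do); the two-position swap move and the geometric cooling schedule are illustrative choices that would need tuning:

```python
import math
import random

def anneal(letters, score, steps=20000, t_start=2.0, t_end=0.01):
    current = list(letters)
    random.shuffle(current)
    current_score = score(current)
    best, best_score = current[:], current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        # Propose a move: swap two random positions.
        i, j = random.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]
        new_score = score(current)
        delta = new_score - current_score
        # Metropolis criterion: always accept improvements, and accept
        # worse moves with probability exp(delta / t).
        if delta >= 0 or random.random() < math.exp(delta / t):
            current_score = new_score
            if current_score > best_score:
                best, best_score = current[:], current_score
        else:
            # Reject: undo the swap.
            current[i], current[j] = current[j], current[i]
    return ''.join(best), best_score
```

Because worse swaps are still accepted with probability exp(delta / t), the search can escape locally good but globally poor orderings; as t cools, it behaves more and more like greedy hill climbing.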