Search code examples
markov-chainsdeterministic

Avoiding Determinism with Markovian Logic


I just began reading more about Markov Chain Generators today, and am really intrigued by the whole process of building one. From my understanding, the future state is dependent upon the statistical past states to the present.

Example:

Hello World. Hello Dolly. Hello World.

"World" follows "Hello" ~66% of the time in that source.

If that is always the case, how then do you avoid out-putting the same results each time? The statistical-occurrences won't change with a static string, so am I right to assume that no variants will every be generated, unless the source-data is modified in some way?

How could I get variations from a static-source, considering the statistical-values, yet allowing some flexibility? Using my example above, how do I allow my generator to follow "Hello" with "Dolly," when "Dolly" only follows "Hello" 33% of the time?

I guess what I'm asking is, How do I base the probability of my next selection on the statistical presence of the words that follow my present selection? That way, "Dolly" shows up 33% of the time, and "World" shows up 66% of the time - or am I completely lost here?


Solution

  • You use a random number generator to pick which path you go down. You have to save each state (which is really a history of N previous items) and the probabilities for that state. Then you pick a random number and decide based on it what the next state you transition to is.

    In your example you have a Markov chain with an N of 1 you would have a chain structure that looked something like this:

    <start> -> Hello : 1.0
    
    Hello -> World. : 0.66666
    Hello -> Dolly. : 0.33333
    
    Dolly. -> Hello : 1.0
    
    World. -> <end> : 0.5
    World. -> Hello : 0.5
    

    If your current state is Hello, then your next possible states are World. and Dolly.. Generate a random number between 0 and 1 and choose World. if it's less than 0.666666, otherwise choose Dolly.

    With an N=2 Markov chain, you get almost deterministic behavior with that input:

    <start> <start> -> <start> Hello : 1.0
    
    <start> Hello -> Hello World. : 1.0
    
    Hello World. -> World. Hello : 0.5
    Hello World. -> World. <end> : 0.5
    
    World. Hello -> Hello Dolly. : 1.0
    
    Hello Dolly. -> Dolly. Hello : 1.0
    
    Dolly. Hello -> Hello World. : 1.0