simulate markov chain using networkx and numpy

My goal is to simulate a Markov chain using networkx and numpy. I write the following code

import numpy as np
import networkx as nx


states = [
    'distance',
    'strike',
    'knockout'
]


transition_matrix = np.array([
    [0.85, 0.15, 0],
    [0.98, 0, 0.02],
    [0, 0, 1]
])


def create_graph(T: np.ndarray) -> nx.DiGraph:
    G = nx.DiGraph()
    G.add_nodes_from(states)

    for i, u in enumerate(states):
        for j, v in enumerate(states):
            p = T[i][j]

            if p > 0:
                G.add_edge(u, v, p=p)

    return G



G = create_graph(T=transition_matrix)

Basically, there are two fighters that are standing in front of each other. They move around with a probability of 0.85 and strike each other with a probability of 0.15. The probability that a strike results in a knockout is 0.02. Knockout is an absorbing state and ends the fight.

What's the best/fastest way to simulate a fight and keep track of the state? So ideally, I would like my output look as follows:

['distance', 'distance', 'distance', 'distance', 'strike', 'distance', ... , 'distance', 'strike', 'knockout']

Solution

Here is a solution using just numpy.random.choice. At each timestep, you pick the row of the transition matrix for your current state to parameterise np.random.choice. E.g. if state is 0, you look at the first row, of the transition matrix [.85, .15, 0]. Now you sample the new state [0,1,2] using the probabilities [.85, .15, 0] and so on:

import numpy as np


states = [
    'distance',
    'strike',
    'knockout'
]


transition_matrix = np.array([
    [0.85, 0.15, 0],
    [0.98, 0, 0.02],
    [0, 0, 1]
])

state = 0
past_states = [state]

for a in range(1000):
    state = np.random.choice([0,1,2], p=transition_matrix[state])
    past_states.append(states[state])

If you want your markov chain to stop at knockout, you can use a while loop:

past_states = []
state = 0

while state != 2:
    state = np.random.choice([0,1,2], p=transition_matrix[state])
    past_states.append(states[state])

If you want to pitch two fighters against each other you can run two markov chains with the above stop condition. Shortest list of past states wins because there the stop condition was reached earlier.