I need help with a question about Markov chains and data preprocessing. Suppose I have the following matrix relating individuals to states over time:
     ID Time1 Time2
1 14021     A     A
2 15031     B     A
3 16452     A     C
I would like to obtain, for this matrix, the state transition matrix. That is, what is required is the matrix of transition counts (rows are the state at Time1, columns the state at Time2):
  A B C
A 1 0 1
B 1 0 0
C 0 0 0
and the same thing, but now weighted by the total number of transitions from that state, i.e.,
    A B   C
A 0.5 0 0.5
B 1.0 0 0.0
C 0.0 0 0.0
(as there are two transitions leaving state A). I know that the markovchain package has functionality for doing this if one has a sequence, say AAABBAAABBCC, but not if the data is set up like mine. Ideally a direct procedure would be great, but a way of turning the data into a set of sequences would work as well.
An igraph approach, using df from Joseph's answer:
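library(igraph)
# graph_from_data_frame() reads the first two columns of df as from/to,
# so each row becomes one directed edge Time1 -> Time2
g <- graph_from_data_frame(df)
# weight each edge by 1 / out-degree of its source state;
# as.character() forces lookup by state name even if Time1 is a factor
E(g)$weight <- 1 / degree(g, mode = "out")[as.character(df$Time1)]
as_adj(g, attr = "weight", sparse = FALSE) # output weighted adjacency matrix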
    A B   C
A 0.5 0 0.5
B 1.0 0 0.0
C 0.0 0 0.0
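
As a cross-check that does not need igraph, here is a minimal base R sketch that produces both matrices from the question. The df below is my own reconstruction of the example data (Joseph's df may be set up differently), and states, counts and probs are just illustrative names.

```r
# Assumed reconstruction of the example data from the question
df <- data.frame(ID    = c(14021, 15031, 16452),
                 Time1 = c("A", "B", "A"),
                 Time2 = c("A", "A", "C"),
                 stringsAsFactors = FALSE)

# Use one common set of states so the matrix is square,
# even for states that appear in only one of the two columns
states <- sort(unique(c(df$Time1, df$Time2)))

# Transition counts: rows = state at Time1, columns = state at Time2
counts <- table(factor(df$Time1, levels = states),
                factor(df$Time2, levels = states))

# Row-normalize so each row is divided by its number of outgoing transitions
probs <- prop.table(counts, margin = 1)

# States with no outgoing transitions give 0/0 = NaN; report them as 0
probs[is.nan(probs)] <- 0

counts
probs
```

A state with no outgoing transitions (C here) would otherwise show up as a row of NaN, so it is set to 0 to match the output above.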