I'm trying to generate a sample from an order-n transition matrix of a Markov chain in R. I've successfully constructed the transition matrix with the following code:
set.seed(1)
dat <- sample(c("A", "B", "C"), size = 2000, replace = TRUE) # Data
n <- 2 # Order of the transition matrix
if (n > 1) {
  from <- head(apply(embed(dat, n)[, n:1], 1, paste, collapse = ""), -1)
  to <- dat[-1:-n]
} else {
  from <- dat[-length(dat)]
  to <- dat[-1]
}
fromTo <- data.frame(cbind(from, to))
TM <- table(fromTo)
TM <- TM / rowSums(TM) # Transition matrix
However, I'm having trouble writing code that generates a sample from this transition matrix and that adapts to different values of n. Is there a way to do it?
Ideally, I'd prefer a solution that doesn't rely on the 'markovchain' package, because of compatibility issues across different R versions.
If you just want to draw a sample from the given transition matrix, you can try the code below, for example. It builds on the MarkovChain function from my earlier answer (reproduced further down):
MarkovChainSampling <- function(dat, ord, preStat) {
  TM <- MarkovChain(dat, ord)                    # transition matrix for this order
  sample(colnames(TM), 1, prob = TM[preStat, ])  # draw one state using row preStat
}
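Here preStat must be one of the row names of TM, i.e. a string of ord - 1 symbols. The draw is random, so your results may differ from the ones below: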
> MarkovChainSampling(dat, 2, "A")
[1] "C"
> MarkovChainSampling(dat, 3, "AB")
[1] "A"
> MarkovChainSampling(dat, 4, "AAA")
[1] "C"
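To go from a single draw to a whole simulated sequence for any order, one option is to keep the last k symbols as the current state and slide that window forward after every draw. What follows is only a rough sketch in base R, not something taken from the answers here: build_tm and simulate_chain are hypothetical helper names, k is the number of previous symbols in the state, and it assumes single-character symbols and that every state reached during the simulation appears as a row of the matrix.

```r
# Hypothetical helpers for illustration; base R only.
set.seed(1)
dat <- sample(c("A", "B", "C"), size = 2000, replace = TRUE)

# Transition matrix whose row names are the k previous symbols, oldest first.
build_tm <- function(x, k) {
  # embed() returns the newest value in column 1, so reverse the columns
  m <- embed(x, k + 1)[, (k + 1):1, drop = FALSE]
  pre <- do.call(paste0, as.data.frame(m[, seq_len(k), drop = FALSE]))
  cur <- m[, k + 1]
  prop.table(table(pre = pre, cur = cur), 1)  # each row sums to 1
}

# Simulate len symbols, starting from the state init (a row name of tm).
simulate_chain <- function(tm, init, len) {
  stopifnot(init %in% rownames(tm))
  k <- nchar(init)                     # assumes single-character symbols
  out <- character(len)
  state <- init
  for (i in seq_len(len)) {
    nxt <- sample(colnames(tm), 1, prob = tm[state, ])
    out[i] <- nxt
    # drop the oldest symbol and append the new one
    state <- paste0(substr(state, 2, k), nxt)
  }
  out
}

tm2 <- build_tm(dat, 2)
sim <- simulate_chain(tm2, init = "AB", len = 20)
paste(sim, collapse = "")
```

Changing k in build_tm and passing an init string of the same length is all that is needed to switch order. prop.table() is used here because proportions() is only available in newer R versions.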
I think you are after the transition matrix of a Markov chain of order n. Below is one option that might give you some clues.
You can use embed as shown below:
MarkovChain <- function(dat, ord) {
  d <- as.data.frame(embed(dat, ord))
  df <- with(
    d,
    data.frame(
      pre = do.call(paste, c(d[-ord], sep = "")),
      cur = d[[ord]]
    )
  )
  proportions(table(df), 1)
}
and you will obtain the following (each row is labelled by ord - 1 symbols, so ord = 2 gives single-symbol rows and ord = 3 gives pairs):
> MarkovChain(dat, 2)
   cur
pre         A         B         C
  A 0.3377386 0.3509545 0.3113069
  B 0.3333333 0.3348281 0.3318386
  C 0.3513097 0.3174114 0.3312789
> MarkovChain(dat, 3)
    cur
pre          A         B         C
  AA 0.3347826 0.3826087 0.2826087
  AB 0.3430962 0.3263598 0.3305439
  AC 0.3396226 0.3160377 0.3443396
  BA 0.3273543 0.2959641 0.3766816
  BB 0.3392857 0.3482143 0.3125000
  BC 0.3783784 0.3063063 0.3153153
  CA 0.3524229 0.3700441 0.2775330
  CB 0.3155340 0.3300971 0.3543689
  CC 0.3348837 0.3302326 0.3348837
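One thing worth checking before relying on the row labels: embed() puts the most recent value in its first column, so in MarkovChain as written, pre is built from the later observations of each window and cur is the earliest one. For random data like dat this barely shows, but on an ordered sequence it does. The sketch below illustrates the effect on a toy sequence and shows a variant with the columns reversed. MarkovChainAsWritten and MarkovChainForward are hypothetical names, and prop.table() stands in for proportions() only so that the snippet also runs on older R versions.

```r
x <- c("A", "B", "C", "A", "B", "C")  # toy sequence: A -> B -> C -> A ...

embed(x, 2)  # each row is (x[t], x[t-1]): newest value first

# Same logic as MarkovChain above (prop.table() is the older name for proportions())
MarkovChainAsWritten <- function(dat, ord) {
  d <- as.data.frame(embed(dat, ord))
  df <- data.frame(pre = do.call(paste, c(d[-ord], sep = "")), cur = d[[ord]])
  prop.table(table(df), 1)
}

# Hypothetical variant: reverse the columns so they run oldest -> newest
MarkovChainForward <- function(dat, ord) {
  d <- as.data.frame(embed(dat, ord)[, ord:1, drop = FALSE])
  df <- data.frame(pre = do.call(paste, c(d[-ord], sep = "")), cur = d[[ord]])
  prop.table(table(df), 1)
}

rev_tm <- MarkovChainAsWritten(x, 2)
fwd_tm <- MarkovChainForward(x, 2)

rev_tm["B", "A"]  # all of row B's mass, although B is always followed by C in x
fwd_tm["B", "C"]  # all of row B's mass, matching the sequence
```

If the forward direction is what you need for sampling, the reversed-column variant keeps pre in time order as well, so a state such as "AB" means A followed by B.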