I have contingency tables RxC. How chisq.test
and fisher.test
functions work when called with arguments simulate.p.value = TRUE, B = 5000
?
I'm using the code below to validate the association (or independence) in the contingency table DATA
:
chisq.test(DATA, simulate.p.value = TRUE, B = 5000, correct = FALSE)
fisher.test(DATA,hybrid = TRUE, simulate.p.value = TRUE)
I know that the test can perform Monte Carlo simulations to estimate the p-value of the test, what I want to know is how these simulations are done internally, that is, if the simulations are made to arrive at a normal distribution or another distribution to deliver the p-value at the end of the test?
In chisq.test
and fisher.test
p-value simulation is carried out in a non-parametric way without prior distribution assumption. Please see the excerpt of source code for chisq.test
:
if (simulate.p.value) {
setMETH()
nx <- length(x)
sm <- matrix(sample.int(nx, B * n, TRUE, prob = p),
nrow = n)
ss <- apply(sm, 2L, function(x, E, k) {
sum((table(factor(x, levels = 1L:k)) - E)^2/E)
}, E = E, k = nx)
PARAMETER <- NA
PVAL <- (1 + sum(ss >= almost.1 * STATISTIC))/(B + 1)
}