I am trying to follow this test:
Suppose I have the following data:
set.seed(123)
active_MJO <-c(6L, 2L, 11L, 20L, 62L, 15L, 2L, 51L, 58L, 100L, 45L, 44L, 49L,
86L, 28L, 1L, 1L, 40L, 79L, 99L, 86L, 50L, 9L, 78L, 45L, 100L,
77L, 44L, 45L, 93L)
inactive_MJO <-c(83L, 170L, 26L, 66L, 156L, 40L, 29L, 72L, 109L, 169L, 153L,
136L, 169L, 133L, 153L, 13L, 24L, 148L, 121L, 80L, 125L, 21L,
135L, 155L, 161L, 171L, 124L, 177L, 167L, 162L)
I dont know how to implement the above test in R.
I have tried the following but I am not sure if this is correct.
sig.test <- function (x){
a <- sample(active_MJO)
b <- sample(inactive_MJO)
sum(a > b)
}
runs <- 1000
sim <- sum(replicate(runs,sig.test(dat))+1)/(runs+1)
I think the above is not correct. Where can I put the 950/1000 condition?
Apologies, I am new to bootstrapping/Monte Carlo test.
I'll appreciate any help on this.
Sincerely, Lyndz
First, it's important to note that they are sampling 30 frequency pairs. Since it's bootstrapping, those samples will be with replacement.
Then they compare the average active to average inactive. This is equivalent to:
They repeat the process 1000 times then compare the results of the 1000 comparisons to 950.
The following code performs #2:
set.seed(123)
active_MJO <-c(6L, 2L, 11L, 20L, 62L, 15L, 2L, 51L, 58L, 100L, 45L, 44L, 49L,
86L, 28L, 1L, 1L, 40L, 79L, 99L, 86L, 50L, 9L, 78L, 45L, 100L,
77L, 44L, 45L, 93L)
inactive_MJO <-c(83L, 170L, 26L, 66L, 156L, 40L, 29L, 72L, 109L, 169L, 153L,
136L, 169L, 133L, 153L, 13L, 24L, 148L, 121L, 80L, 125L, 21L,
135L, 155L, 161L, 171L, 124L, 177L, 167L, 162L)
diff_MJO <- active_MJO - inactive_MJO
sim <- sum(replicate(1e3, sum(sample(diff_MJO, 30, replace = TRUE)) > 0))
> sim
[1] 0
In this case, none of the 1000 replications resulted in an average active_MJO
that was greater than the average inactive_MJO
. This is unsurprising after plotting the histogram of sums of bootstrapped differences:
diff_MJO <- replicate(1e5, sum(sample(diff_MJO, 30, replace = TRUE)))
hist(diff_MJO)