I have a question about random sampling.
Are the two following results (A and B) statistically the same?
nobs <- 1000
A <- rt(n=nobs, df=3, ncp=0)
simulations <- 50
B <- unlist(lapply(rep.int(nobs/simulations, times=simulations),function(y) rt(n=y, df=3, ncp=0) ))
I thought it would be but now I've been going back and forth.
Any help would be appreciated.
Thanks
With some small changes, you can even make them numerically equal. You only need to seed the RNG and omit specifying the ncp
parameter and use the default value (of 0) instead:
nobs <- 1000
set.seed(42)
A <- rt(n=nobs, df=3)
simulations <- 50
set.seed(42)
B <- unlist(lapply(rep.int(nobs/simulations, times=simulations),function(y) rt(n=y, df=3) ))
all.equal(A, B)
#[1] TRUE
Why don't you get equal results when you specify ncp=0
?
Because then rt
assumes that you actually want a non-central t-distribution and the values are calculated with rnorm(n, ncp)/sqrt(rchisq(n, df)/df)
. That means when creating 1000 values at once rnorm
is called once and rchisq
is called once subsequently. If you create 50 times 20 values, you have alternating calls to these RNGs, which means the RNG states are different for the rnorm
and rchisq
calls than in the first case.