Search code examples
rrandommontecarloresamplingstatistics-bootstrap

R: Distribution of Random Samples vs. 1 Random Sample


I have a question about random sampling.

Are the two following results (A and B) statistically the same?

nobs <- 1000

A <- rt(n=nobs, df=3, ncp=0)

simulations <- 50
B <- unlist(lapply(rep.int(nobs/simulations, times=simulations),function(y) rt(n=y, df=3, ncp=0) ))

I thought it would be but now I've been going back and forth.

Any help would be appreciated.

Thanks


Solution

  • With some small changes, you can even make them numerically equal. You only need to seed the RNG and omit specifying the ncp parameter and use the default value (of 0) instead:

    nobs <- 1000
    set.seed(42)
    A <- rt(n=nobs, df=3)
    
    simulations <- 50
    set.seed(42)
    B <- unlist(lapply(rep.int(nobs/simulations, times=simulations),function(y) rt(n=y, df=3) ))
    
    all.equal(A, B)
    #[1] TRUE
    

    Why don't you get equal results when you specify ncp=0?

    Because then rt assumes that you actually want a non-central t-distribution and the values are calculated with rnorm(n, ncp)/sqrt(rchisq(n, df)/df). That means when creating 1000 values at once rnorm is called once and rchisq is called once subsequently. If you create 50 times 20 values, you have alternating calls to these RNGs, which means the RNG states are different for the rnorm and rchisq calls than in the first case.