I would like to generate a sample of size n = 100 with mean = 0 and sd = 1 whose distribution is as close to normal as possible. Using rnorm alone gives samples that vary a lot from one draw to the next.
The only way I found was to average many sorted rnorm draws.
rowMeans(replicate(10000, sort(rnorm(100, 0, 1))))
This returns a rather satisfying result, but I'm not sure it's the most efficient way of doing it.
I don't need the mean and sd to be strictly equal to 0 and 1; I only want the distribution to "look" like a normal distribution when plotting the density curve.
It seems that the qnorm method works worse than the "average" method:
# qnorm method
x <- qnorm(seq(.00001, .99999, length.out = 100), mean=0, sd=1)
plot(density(x))
# average method
x <- rowMeans(replicate(10000, sort(rnorm(100, mean=0, sd=1))))
plot(density(x))
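One likely reason the qnorm version looks worse here: with fixed bounds of .00001 and .99999, the first and last of the 100 points land near ±4.26, whereas the averaged extremes of a sample of 100 sit at roughly ±2.5. A quick check, as a sketch only (the names x_q and x_avg and the reduced replicate count are chosen here for illustration):

```r
set.seed(1)  # only so the averaged result is reproducible
n <- 100

# qnorm method with the fixed bounds used above
x_q <- qnorm(seq(.00001, .99999, length.out = n), mean = 0, sd = 1)

# average method, with fewer replicates to keep the check quick
x_avg <- rowMeans(replicate(2000, sort(rnorm(n, mean = 0, sd = 1))))

range(x_q)    # extremes pushed far out by the fixed bounds
range(x_avg)  # averaged extremes sit much closer to the center
```

The stretched tails of x_q are what flatten its density plot compared with the averaged sample.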
I would be happy with a deterministic solution that gives results close to the average method more efficiently.
Based on the answers, the following seems to work, adjusting the bounds relative to n:
n <- 100  # sample size
x <- qnorm(seq(1/n, 1-1/n, length.out = n), mean=0, sd=1)
Below is a comparison of the qnorm and average methods for different values of n:
par(mfrow=c(6,2))
for(n in c(10, 20, 100, 500, 1000, 9876)){
  # qnorm method (blue)
  x <- qnorm(seq(1/n, 1-1/n, length.out = n), mean=0, sd=1)
  plot(density(x), col="blue", lwd=2)
  # average method (red)
  x <- rowMeans(replicate(10000, sort(rnorm(n, mean=0, sd=1))))
  plot(density(x), col="red", lwd=2)
}
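To compare the two methods by numbers rather than by eye, one option is to look at the standard deviation and the largest value of each sample. This is only a sketch: the helper name compare_methods is hypothetical, and the replicate count and the set of n values are reduced here purely for speed.

```r
set.seed(42)  # only for reproducibility of the averaged samples

# Hypothetical helper: summarise both methods for a given n
compare_methods <- function(n, reps = 1000) {
  x_q   <- qnorm(seq(1/n, 1 - 1/n, length.out = n), mean = 0, sd = 1)
  x_avg <- rowMeans(replicate(reps, sort(rnorm(n, mean = 0, sd = 1))))
  c(n = n,
    sd_qnorm  = sd(x_q),  sd_average  = sd(x_avg),
    max_qnorm = max(x_q), max_average = max(x_avg))
}

# One row per sample size
res <- t(sapply(c(10, 20, 100, 500), compare_methods))
print(round(res, 3))
```

If the two sd columns and the two max columns stay close across rows, the deterministic version is a reasonable stand-in for the averaged one at those sample sizes.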
You can use the bayestestR package:
library(bayestestR)
x <- rnorm_perfect(n = 100, mean = 0, sd = 1)
plot(density(x))
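If installing a package is not an option, a similar deterministic sample can be sketched in base R by passing the evenly spread probability points from ppoints() to qnorm(). This is an alternative suggested here, not a description of how rnorm_perfect is implemented, and the rescaling step is optional since the question does not require the mean and sd to be exact.

```r
n <- 100

# ppoints(n) gives n probabilities strictly inside (0, 1),
# symmetric around 0.5, so no manual bounds are needed
x <- qnorm(ppoints(n), mean = 0, sd = 1)

# Optional: rescale so the sample mean and sd are exactly 0 and 1
x_std <- as.numeric(scale(x))

plot(density(x))
```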