I'll probably want to hit myself over the head for not getting this:
How do I generate a vector with the expected heights of a normal distribution over Y bins (nbins in the code below), with exactly N elements in total?
Like in the picture below:
nbins = 15
nstat = 77
I know I could draw rnorm(77), but that'll never be exactly normal, and looping over 10,000 iterations or so seems overkill.
So I tried using qnorm for that purpose, but I have a hunch that:
Here is what I got:
nbins <- 15
nstat <- 77
item.pos <- qnorm(  # to the left of which value lies...
  1:nstat / (nstat + 1)  # ... the n-th statement?
  # using nstat + 1 because we want midpoints, not cutoffs for later
)
bins <- cut(
  x = item.pos,
  breaks = nbins,
  ordered_result = TRUE
)
height <- summary(bins)       # number of items per bin
height <- as.numeric(height)  # drop the bin labels, keep the counts
If your data range from -2 to 2, with 15 intervals and a sample size of 77, I would suggest the following to get the expected heights of the 15 intervals:
rn <- dnorm(seq(-2,2, length = 15))/sum(dnorm(seq(-2,2, length = 15)))*77
[1] 1.226486 2.084993 3.266586 4.716619 6.276462 7.697443 8.700123 9.062576 8.700123 7.697443
[11] 6.276462 4.716619 3.266586 2.084993 1.226486
The barplot of this looks like:
barplot(height = rn, names.arg = round(seq(-2, 2, length = 15), 2))
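As an aside, rn above rescales the density evaluated at 15 points. A variant that integrates the density over each bin can be sketched with pnorm. This is only a sketch under an assumption that is not in the original: that the 15 bins are equal-width and centred on the same points, so the outer edges sit half a bin width beyond -2 and 2. The names centers, half, edges, p and expected are made up for illustration.

```r
nbins <- 15
nstat <- 77

# Assumption: equal-width bins centred on seq(-2, 2, length.out = 15)
centers <- seq(-2, 2, length.out = nbins)
half <- diff(centers)[1] / 2
edges <- c(centers[1] - half, centers + half)  # 16 edges for 15 bins

# Probability mass per bin, renormalised so the heights sum to nstat
p <- diff(pnorm(edges))
expected <- p / sum(p) * nstat
round(expected, 3)
```

For bins this narrow the two approaches should give similar heights; the pnorm version mainly matters when bins are wide or unequal.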
So, in your sample of 77 you would expect the first value of the sequence in 1.226486 cases, the second value in 2.084993 cases, etc. It's difficult to generate a vector as you described at the beginning, because the sequence above does not consist of integers.
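One way around the non-integer heights, if whole counts are required, is to round them so that the total is still exactly 77. The sketch below uses largest-remainder rounding, which is a swapped-in technique and not part of the answer above: take the floor of each height, then hand the leftover items to the bins with the largest fractional parts. The names counts, leftover, extra and items are made up for illustration.

```r
nbins <- 15
nstat <- 77

# Expected (non-integer) heights, as computed above
rn <- dnorm(seq(-2, 2, length.out = nbins))
rn <- rn / sum(rn) * nstat

# Largest-remainder rounding: floor first, then distribute what is left
counts <- floor(rn)
leftover <- nstat - sum(counts)
extra <- order(rn - counts, decreasing = TRUE)[seq_len(leftover)]
counts[extra] <- counts[extra] + 1

counts       # integer bin heights
sum(counts)  # equals nstat

# A vector of exactly nstat elements, each holding its bin index
items <- rep(seq_len(nbins), times = counts)
```

Each rounded count differs from its expected height by less than one, so barplot(counts) should stay close to the bell shape while table(items) reproduces the counts exactly.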