r, normal-distribution

How to vectorize a pairwise computation in R?


I am working on a continuous distribution for which I need to test for normality. As part of the process, I am creating buckets in order to create categories. I need to test if my data is Normal with a mean of 24.9 and sd of 7.5.

I need to test normality over the following buckets:

range <- c('<8', '12', '16', '20', '24', '28', '32', '36', '40', '44', '>44')

In order to find the observed values, I need to perform the following computation in R, which gives the value for each bucket under the Normal distribution.

obs <- c()
# total number of observed value = 62
obs <- append(obs, pnorm(8, 24.9, 7.5) * 62) # for bucket <8
obs <- append(obs, (pnorm(12, 24.9, 7.5) - pnorm(8, 24.9, 7.5)) * 62) # for bucket 12
# ...
# for bucket 16
# for bucket 20 etc.

Is there a way to make this logic vectorized such that I don't need to make a formula for every bucket?


Solution

  • Here's an idea based on taking the diff. I'm not sure what the 3rd range would look like, but each bucket is always p_norm[i] - p_norm[i-1]:

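    # upper edge of each bucket; 100 stands in for the open-ended '>44' bucket
    range_x <- c(8,12,16,20,24,28,32,36,40,44,100)
    # cumulative Normal probability at each upper edge
    p_norm <- pnorm(range_x, 24.9, 7.5)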
    
    c(p_norm[1], diff(p_norm))*62
    
     [1]  0.7513823  1.8970234  4.6477273  8.6236507 12.1191940
     [6] 12.9007877 10.4021663  6.3529978  2.9386039  1.0293193
    [11]  0.3371475
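
The same idea can be wrapped in a small function so the mean, sd and sample size are not hard-coded. This is only a sketch: the helper name `expected_counts` and the variables `upper` and `labels` are made up for illustration, and it assumes `Inf` is an acceptable upper edge for the open-ended '>44' bucket in place of the 100 used above.

```r
# Upper edge of each bucket, taken from the question.
# Assumption: Inf closes the open-ended ">44" bucket (the answer above used 100).
upper  <- c(8, 12, 16, 20, 24, 28, 32, 36, 40, 44, Inf)
labels <- c("<8", "12", "16", "20", "24", "28", "32", "36", "40", "44", ">44")

# Hypothetical helper: count per bucket implied by Normal(mean, sd) for n observations.
expected_counts <- function(upper, mean, sd, n) {
  cdf <- pnorm(upper, mean = mean, sd = sd)
  # diff(c(0, cdf)) yields cdf[1] for the first bucket,
  # then cdf[i] - cdf[i - 1] for every later bucket.
  n * diff(c(0, cdf))
}

expected <- setNames(expected_counts(upper, 24.9, 7.5, 62), labels)
print(round(expected, 4))
```

Because `pnorm(Inf, ...)` is exactly 1, the bucket values here add up to the full 62 observations, and the named vector makes it easier to line each value up with its bucket.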