
How to use pnorm in R to calculate the probability that the mean of N random variables is less than a given value


Let's say that the lifetime of a particular type of calculator follows a normal distribution with mean = 5000 hours and SD = 500 hours. If I randomly choose one calculator, what is the probability that it will last less than 4000 hours?

My calculation in R is as follows:

pnorm(4000, mean = 5000, sd = 500)

[1] 0.02275013

Is my understanding correct that the probability is 0.02275013?

Next, let's say a random sample of 15 calculators is picked. What is the probability that the mean lifetime is less than 4000 hours? I am not sure how to do this in R. What I've done is:

sample<-rnorm(15, mean = 5000, sd =500)
pop<-sd(sample/sqrt(15))
pnorm(4000, 4800, pop)

[1] 1.723545e-10

Is my understanding correct?


Solution

  • SO is for coding questions. This isn't much of a coding question. But here I go anyway.

    I'll start by pointing out that SO's guidelines state "Questions asking for homework help must include a summary of the work you've done so far to solve the problem, and a description of the difficulty you are having solving it." I'm not sure this question [edit: the question as originally asked] meets this guideline, but it's an important stats topic, so let's cover it.

    You are correct that pnorm returns the cumulative probability up to q (here q = 4000) for a normal distribution with a given mean and standard deviation (here, 5000 and 500). So yes, the probability that a randomly chosen calculator lasts less than 4000 hours is 0.02275; that is, approximately 2.3% of calculators last less than 4000 hours.
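
    As a quick check of what pnorm is doing here, the same figure can be reproduced by standardizing by hand. This is only a minimal sketch; the variable names (mu, sigma, q, z and the p_* results) are illustrative and do not come from the question.

    ```r
    mu    <- 5000   # population mean lifetime in hours (from the question)
    sigma <- 500    # population standard deviation in hours (from the question)
    q     <- 4000   # threshold of interest in hours

    # Direct call: P(X < q) for X ~ N(mu, sigma^2)
    p_direct <- pnorm(q, mean = mu, sd = sigma)

    # Equivalent route: standardize first, then use the standard normal CDF
    z <- (q - mu) / sigma            # z is -2 here
    p_standardized <- pnorm(z)

    # Complement: probability of lasting more than 4000 hours
    p_upper <- pnorm(q, mean = mu, sd = sigma, lower.tail = FALSE)

    print(c(direct = p_direct, standardized = p_standardized, upper = p_upper))
    ```

    The first two values should agree, and the first and third should sum to 1.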

    Your main question, however, is about the mean of 15 randomly chosen calculators. This statistic (the sample mean) has its own probability distribution. It turns out that the mean of N random variables, each distributed N(mu, sigma^2) and each independent of the others, has a normal distribution with the same expectation (mu) and a variance of sigma^2/N. In short:

    • If X_i ~ N(mu, sigma^2) for i=1,...,N and they're independent
    • Then mean ~ N(mu, sigma^2/N)
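
    For anyone who wants the missing step, the expectation and variance can be derived in a few lines, using only linearity of expectation and the fact that variances of independent variables add. The symbol \bar{X} for the sample mean is introduced here for convenience and is not used in the original answer. Normality of the mean is a separate fact: a sum of independent normal variables is itself normal.

    ```latex
    \begin{aligned}
    \bar{X} &= \frac{1}{N}\sum_{i=1}^{N} X_i \\
    \mathrm{E}[\bar{X}] &= \frac{1}{N}\sum_{i=1}^{N} \mathrm{E}[X_i] = \mu \\
    \mathrm{Var}(\bar{X}) &= \frac{1}{N^2}\sum_{i=1}^{N} \mathrm{Var}(X_i) = \frac{N\sigma^2}{N^2} = \frac{\sigma^2}{N}
    \end{aligned}
    ```

    With sigma = 500 and N = 15, the standard deviation of the mean is 500/sqrt(15), which is about 129.1 hours.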

    So in R:

    pnorm(4000, mean=5000, sd=500/sqrt(15))
    # 4.742869e-15
    

    This is effectively zero. This makes sense because there is already a low probability of randomly sampling a single calculator that lasts less than 4000 hours (only 2.3%). Randomly sampling 15 calculators that average less than 4000 hours would be extremely unlucky, and thus the probability of such an event is near zero.
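
    The sampling-distribution claim can be sanity-checked by simulation. This is a rough sketch rather than part of the original answer: the seed, the number of replications (n_sims), and the variable names are arbitrary choices. The tail probability itself is far too small to estimate this way, so the check is on the spread of the simulated sample means.

    ```r
    set.seed(42)      # arbitrary seed, only for reproducibility
    n_sims <- 20000   # number of simulated samples (arbitrary assumption)
    n      <- 15      # calculators per sample (from the question)

    # Draw n_sims samples of size n and keep the mean of each one
    sample_means <- replicate(n_sims, mean(rnorm(n, mean = 5000, sd = 500)))

    theoretical_se <- 500 / sqrt(n)      # sigma / sqrt(N)
    empirical_se   <- sd(sample_means)   # should be close to theoretical_se

    print(c(theoretical = theoretical_se, empirical = empirical_se))

    # Count of simulated means below 4000 hours; expected to be 0 in practice
    print(sum(sample_means < 4000))
    ```

    By contrast, the attempt in the question estimates the standard deviation from a single random sample and uses a mean of 4800 rather than the stated 5000, so its result changes from run to run and does not reflect the known population parameters.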