python numpy probability normal-distribution probability-density

Calculate and plot the theoretical normal distribution N(2, 1) in the interval of [-1, 5]


I've been given the task of calculating and plotting the normal distribution N(2, 1) on the interval [-1, 5].

Here's what I've tried:

vec = np.random.norm(2, 1, 7);
ND = stats.norm(2, 1).pdf(vec)
x = np.arange(1, 6, 1)
plt.figure()
plt.plot(x, 'r')
plt.hist(ND)
plt.show()

As you may have figured out, this does not give me the result I'm looking for.

I cannot for the life of me figure this out. Please note that I am a student who has only recently started coding in Python.

I have been asked to generate random numbers with np.random.normal ranging from -1 to 5. However, I have yet to understand how I can do that, considering that the interval begins at -1.

Secondly I have been asked to use the function norm.pdf from scipy.stats but I do not understand the documentation on this function (https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.norm.html)

Finally I have to plot the results.


Solution

  • To specify an N(2, 1) distribution is to say that you want a normal distribution with mean 2 and variance 1 (so the standard deviation is also 1). In scipy terms, the mean is the location (loc) and the standard deviation is the scale (scale).

    To plot the pdf using matplotlib, choose enough points on the interval [-1, 5] to make a visually smooth graph. This is the purpose of linspace. For each of these points, calculate the pdf using norm.pdf.

    from scipy.stats import norm
    from matplotlib import pyplot as plt
    import numpy as np
    
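    # 100 evenly spaced points covering [-1, 5], both endpoints included
    x = np.linspace(-1, 5, 100, endpoint=True)
    # norm.pdf accepts an array, so the density is evaluated at every point at once
    pdf = norm.pdf(x, loc=2, scale=1)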
    
    plt.plot(x, pdf, 'b-')
    plt.show()
    

    Here I create a sample of size 10. norm.rvs produces deviates over the entire real line, so to obtain deviates in the desired interval I simply discard the ones that fall outside it. Each invocation of norm.rvs with size=1 returns a numpy array of length one, so I take the first item of that array and append it to the overall sample if it lies within the desired interval.

    sample_size = 10
    sample = []
    while len(sample) < sample_size:
        # rvs returns a length-1 array here, so take element 0
        deviate = norm.rvs(loc=2, scale=1, size=1)[0]
        # keep the deviate only if it falls inside [-1, 5]
        if -1 <= deviate <= 5:
            sample.append(deviate)
    print(sample)
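
Since the question asks for numpy's normal generator rather than norm.rvs, the same discard-what-falls-outside idea can be sketched with it, together with a plot of the result. This is only a sketch under some assumptions: the helper names sample_in_interval and plot_sample_with_pdf, the sample size of 1000, the seed and the bin count are all made up for illustration. It uses the Generator returned by np.random.default_rng; np.random.normal(2, 1, size) takes the same loc, scale and size arguments.

```python
import numpy as np
from scipy.stats import norm
from matplotlib import pyplot as plt


def sample_in_interval(mean, sd, low, high, size, rng):
    """Draw `size` normal deviates, keeping only those inside [low, high]."""
    kept = np.empty(0)
    while kept.size < size:
        # Draw a whole batch at once, then discard values outside the interval.
        batch = rng.normal(loc=mean, scale=sd, size=size)
        kept = np.concatenate([kept, batch[(batch >= low) & (batch <= high)]])
    return kept[:size]


def plot_sample_with_pdf(sample, mean, sd, low, high):
    """Overlay a density-normalised histogram of `sample` on the normal pdf."""
    x = np.linspace(low, high, 100)
    fig, ax = plt.subplots()
    ax.hist(sample, bins=20, density=True, alpha=0.5, label="sample")
    ax.plot(x, norm.pdf(x, loc=mean, scale=sd), "b-", label="theoretical pdf")
    ax.legend()
    return fig


rng = np.random.default_rng(seed=0)  # arbitrary seed, only for repeatability
sample = sample_in_interval(mean=2, sd=1, low=-1, high=5, size=1000, rng=rng)
```

Calling plot_sample_with_pdf(sample, 2, 1, -1, 5) followed by plt.show() draws the histogram with the theoretical curve on top. Because [-1, 5] spans three standard deviations on either side of the mean, only about 0.3% of draws are discarded, so the histogram should follow the N(2, 1) curve closely, though not exactly.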