
Drawing a normal curve in Python


I have seen a lot of documentation about the normal distribution and curve sketching in Python, and I am a bit confused by it. I have generated normal random variables with mean 30 and standard deviation 3.7 in Excel, and I have computed the probability density function (PDF) at each value using the NORM.DIST function:

=NORM.DIST(A2,$H$2,$I$2,FALSE)

Based on this formula, I drew a scatter chart in Excel (image not included).

For demonstration purposes I want to draw the same chart in Python. I found SciPy and NumPy versions of this, so please help me clarify how to do it. A sample of my numbers is printed further below.

I tried the following code:

from scipy.stats import norm
import pandas as pd
import matplotlib.pyplot as plt

data_random = pd.read_excel("data_for_normal.xlsx")
data_values = data_random["NormalVariables"].values
pdf_values = norm.pdf(data_values, 30, 3.7)
plt.plot(data_values, pdf_values)
plt.title("normal curve")
plt.xlabel("x values")
plt.ylabel("probability density function")
plt.show()

but I got a plot with lines crisscrossing between the points (image not included).

Output of print(data_random.head(10)):

   NormalVariables
0        29.214494
1        30.170595
2        36.014144
3        30.388626
4        28.398749
5        24.861042
6        29.519316
7        24.207164
8        35.779376
9        26.042977

Solution

  • plt.plot connects data points with lines:
    
    x = [0,1,2]
    y = [1,4,3]
    plt.plot(x,y)
    


    Note that lines are drawn between adjacent elements of the lists, so here a line goes from (0,1) to (1,4) and then to (2,3).

    If the order of the data points is changed, their positions stay the same, but lines are now drawn between different pairs of points:
    
    x = [2,0,1]
    y = [3,1,4]
    plt.plot(x,y)
    


    So the reason you see all the crisscrossing in your plot is that you are plotting unsorted data.

    If you simply want to replicate the plot from Excel, use plt.scatter instead. It plots just the data points and does not draw connections between them.

    x = [2,0,1]
    y = [3,1,4]
    plt.scatter(x,y)
    

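Applied to the data in the question, the two options might look like the sketch below. It assumes the same mean (30) and standard deviation (3.7) as in the question and uses only the ten sample values printed there instead of reading the Excel file. The helper names sorted_pdf and plot_normal_curve are made up for this illustration. norm.pdf(x, loc, scale) plays the role of Excel's NORM.DIST(x, mean, sd, FALSE).

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm


def sorted_pdf(values, mean, sd):
    """Return the values in ascending order together with their normal densities."""
    x = np.sort(np.asarray(values, dtype=float))
    return x, norm.pdf(x, loc=mean, scale=sd)


def plot_normal_curve(values, mean, sd):
    """Draw a line through the sorted points and markers at the raw points."""
    x, y = sorted_pdf(values, mean, sd)
    fig, ax = plt.subplots()
    # Sorted x values: adjacent points are neighbours on the x axis,
    # so the connecting line follows the bell shape without crisscrossing.
    ax.plot(x, y, label="plt.plot on sorted data")
    # Scatter draws no connecting lines, so the order of the data does not matter.
    ax.scatter(values, norm.pdf(values, loc=mean, scale=sd),
               color="tab:orange", label="plt.scatter on raw data")
    ax.set_title("normal curve")
    ax.set_xlabel("x values")
    ax.set_ylabel("probability density function")
    ax.legend()
    return fig


# The ten sample values printed in the question (hypothetical stand-in for the Excel column).
sample = [29.214494, 30.170595, 36.014144, 30.388626, 28.398749,
          24.861042, 29.519316, 24.207164, 35.779376, 26.042977]
```

Calling plot_normal_curve(sample, 30, 3.7) followed by plt.show() should display both versions in one figure. With the real data, data_values from the question could be passed in place of sample.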