
Online Statistics Python: Variance is not calculating correctly


SOF, I am new to Python. I have found a lot of information online, but it all relies on a list when calculating the mean, variance, etc., which I can't use. I have no problem calculating the mean for user inputs, but the variance is off.

From my understanding, variance is the squared difference between a number and the mean. Maybe the problem lies there? I am not sure, to be honest. This is my last resort, so any help would be greatly appreciated. I am also open to any advice on how I am writing my code.

Thanks!

My code so far:

n = input("Enter Number ")
n = int(n)
average = 0

sum = 0

for num in range(0,n+1,1):
    sum = sum + num

mean = (sum * 1.0 / n)

variance = 0

for num in range(n+1):
    sum = (num - mean)**2

variance = (sum * 1.0)

print("Mean is: ", mean, "Variance is: ", variance)

Solution

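  • First, variance is not just (number - mean) squared. It is the sum of (number - mean) squared over all the numbers, divided by n (or n-1). Your second loop assigns to sum instead of adding to it, so only the last term survives.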

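    The range used to calculate the variance should start from 1: range(1,n+1). Starting at 0 adds an extra (0 - mean)**2 term that does not belong to the n numbers being averaged.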

    var = 0
    for num in range(1, n + 1):
        var = var + (num - mean) ** 2
    

    Now the variance can be calculated in two ways, dividing by n or by n-1, which give two different answers:

    variance1 = var / n        # population variance
    variance2 = var / (n - 1)  # sample variance
    

    E.g., for n=10 (mean 5.5), variance1=8.25 and variance2=9.166666666666666

    Divide by n when you are calculating the population variance, and by n-1 when you are calculating the sample variance.
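
As a quick cross-check of the two divisors, the standard library's statistics module offers both: pvariance divides by n and variance divides by n-1. This sketch assumes the data are the integers 1 through n, as in the question, and builds a list only to verify the numbers; it is not meant as the list-free solution the question asks for.

```python
import statistics

n = 10  # example value; in the question this comes from input()
data = list(range(1, n + 1))  # the integers 1..n, built only for this check

population_variance = statistics.pvariance(data)  # divides by n
sample_variance = statistics.variance(data)       # divides by n - 1

print("Population variance:", population_variance)
print("Sample variance:", sample_variance)
```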

    Additional details: when using range, a step value of 1 does not need to be specified. Use range(0,n+1), or simply range(n+1), instead of range(0,n+1,1).

    Avoid using the same variable sum for both the mean and the variance, as it only causes confusion with the formula. The name sum also shadows Python's built-in sum() function.
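
Putting these points together, a corrected version of the script might look like the sketch below. The function name mean_and_variances and the variable names total and squared_diffs are illustrative choices, not from the original code, and the sketch assumes n is at least 2 so that dividing by n-1 is valid.

```python
def mean_and_variances(n):
    """Return (mean, population variance, sample variance) of 1..n.

    Hypothetical helper; assumes n >= 2 so that n - 1 is not zero.
    """
    # First pass: accumulate the sum of the numbers to get the mean.
    total = 0
    for num in range(1, n + 1):
        total += num
    mean = total / n

    # Second pass: accumulate (not overwrite) the squared deviations.
    squared_diffs = 0.0
    for num in range(1, n + 1):
        squared_diffs += (num - mean) ** 2

    return mean, squared_diffs / n, squared_diffs / (n - 1)


mean, variance1, variance2 = mean_and_variances(10)
print("Mean is:", mean, "Population variance:", variance1,
      "Sample variance:", variance2)
```

For n = 10 this gives a mean of 5.5 and a population variance of 8.25, matching the example above. No list is needed, because each loop only keeps a running total.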