python, statistics, mean

Calculate the mean of a combined list of numbers using the old mean


I need to update an existing mean when a new set of data arrives. For example, say I have already calculated the mean of a list of numbers and stored it.

from statistics import mean
l1=[1,0,1,0,0,0,0,1,1,1,0,1]
m1=mean(l1)
print(m1)
0.5

Then, say I get a new list of numbers:

l2=[1,0,1,1,1,1,1,1,1,0,0,0,1,0,0,0,0,1,1,1,0,1]
m2=mean(l2)
print(m2)
0.5909090909090909

Now, if I take the mean of m1 and m2, the result differs from the mean of the two lists combined, because the lists have different lengths:

m3=mean([m1,m2])
print(m3)
0.5454545454545454
m3=mean(l1+l2)
print(m3)
0.5588235294117647

So, basically, how do I calculate the correct new mean m3 using only m1, the length of l1, and l2? (I no longer have the contents of l1, but I can get its length.)


Solution

  • You can do it easily if you know the lengths of l1 and l2. The combined mean is the average of m1 and m2 weighted by the length of each list:

    from statistics import mean
    
    l1 = [1,0,1,0,0,0,0,1,1,1,0,1]
    m1 = mean(l1)
    len1 = len(l1)
    
    l2 = [1,0,1,1,1,1,1,1,1,0,0,0,1,0,0,0,0,1,1,1,0,1]
    m2 = mean(l2)
    len2 = len(l2)
    
    m3 = (len1*m1 + len2*m2) / (len1+len2)
    print(m3)
    

    Output:

    0.5588235294117647
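
Since the question says the contents of l1 are gone and only m1 and its length are stored, the same formula can be wrapped in a small helper. This is a minimal sketch, not part of the original answer: the function name `update_mean` and its return shape are hypothetical. It relies on `old_mean * old_count` recovering the sum of the old list, so the new values only need to be summed once.

```python
def update_mean(old_mean, old_count, new_values):
    """Combine a stored mean with a new batch of values.

    Hypothetical helper: only the old mean and the old count are needed,
    not the old data. Returns (combined_mean, combined_count) so the
    result can be stored and updated again later.
    """
    new_count = len(new_values)
    combined_count = old_count + new_count
    if combined_count == 0:
        raise ValueError("no data points to average")
    if new_count == 0:
        # Nothing new arrived, so the stored mean is unchanged.
        return old_mean, old_count
    # old_mean * old_count reconstructs the sum of the discarded list.
    total = old_mean * old_count + sum(new_values)
    return total / combined_count, combined_count


# Values taken from the question: m1 = 0.5 over 12 items, then l2 arrives.
l2 = [1,0,1,1,1,1,1,1,1,0,0,0,1,0,0,0,0,1,1,1,0,1]
m3, n3 = update_mean(0.5, 12, l2)
print(m3, n3)
```

The printed mean should agree with `mean(l1 + l2)` from the question, up to floating-point rounding. Returning the combined count alongside the mean means the pair can be fed back in when the next batch arrives.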