I need to update an existing mean when a new set of data arrives. For example, say I have already calculated the mean of a list of numbers and stored it:
from statistics import mean
l1=[1,0,1,0,0,0,0,1,1,1,0,1]
m1=mean(l1)
print(m1)
0.5
Then say I get a new list of numbers:
l2=[1,0,1,1,1,1,1,1,1,0,0,0,1,0,0,0,0,1,1,1,0,1]
m2=mean(l2)
print(m2)
0.5909090909090909
Now, if I take the mean of m1 and m2, the result is different from the mean of the two lists combined:
m3=mean([m1,m2])
print(m3)
0.5454545454545454
m3=mean(l1+l2)
print(m3)
0.5588235294117647
So, how do I calculate the correct new mean m3 using only m1, the length of l1, and l2? (I no longer have the contents of l1, but I can get its length.)
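You can do this easily if you know the lengths of l1 and l2. Multiplying each mean by the length of its list recovers that list's sum, so the combined mean is the total sum divided by the total length: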
from statistics import mean
l1 = [1,0,1,0,0,0,0,1,1,1,0,1]
m1 = mean(l1)
len1 = len(l1)
l2 = [1,0,1,1,1,1,1,1,1,0,0,0,1,0,0,0,0,1,1,1,0,1]
m2 = mean(l2)
len2 = len(l2)
m3 = (len1*m1 + len2*m2) / (len1+len2)
print(m3)
Output:
0.5588235294117647
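Since the contents of l1 are no longer available in the question's scenario, the same formula can be wrapped in a small helper that takes only the stored mean and length. This is a minimal sketch; the function name `combine_means` and its parameter names are hypothetical, not part of the `statistics` module.

```python
import math
from statistics import mean

def combine_means(m1, n1, m2, n2):
    """Return the mean of two batches, given each batch's mean and size.

    Hypothetical helper: weights each mean by its batch size, which is
    the same as dividing the total sum by the total count.
    """
    if n1 + n2 == 0:
        raise ValueError("at least one batch must be non-empty")
    return (n1 * m1 + n2 * m2) / (n1 + n2)

# Stored from the first batch; the list itself is assumed to be gone.
m1, n1 = 0.5, 12

# The new batch is still available.
l2 = [1,0,1,1,1,1,1,1,1,0,0,0,1,0,0,0,0,1,1,1,0,1]
m3 = combine_means(m1, n1, mean(l2), len(l2))

# Sanity check against the direct computation: 19 ones out of 34 values.
# isclose is used because floating-point rounding may differ in the last digit.
assert math.isclose(m3, 19 / 34)
print(m3)
```

To keep updating as more batches arrive, store the returned mean together with the combined length (n1 + n2) and pass both in as m1 and n1 next time.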