I will explain my issue using an example:
A=[[1,2,10],[1,2,10],[3,4,5]]
B=[[1,2,30],[6,7,9]]
From these lists of lists, I would like to create a third one:
C=A+B
So I get:
C= [[1, 2, 10], [1, 2, 10], [3, 4, 5], [1, 2, 30], [6, 7, 9]]
Notice that three of the lists inside C, namely [1, 2, 10], [1, 2, 10] and [1, 2, 30], have the same x and y but different z when read as [x, y, z].
So I would like to have this new list:
Averaged= [(1, 2, 16.666), (6, 7, 9), (3, 4, 5)]
where each (x, y) pair from the lists [1, 2, 10], [1, 2, 10], [1, 2, 30] occurs only once, together with the average of the corresponding z values: (10+10+30)/3 = 16.666
I tried using for loops at the beginning but ended up trying to do this with defaultdict.
I ended up with the following, which keeps each (x, y) only once but sums the corresponding z values instead of averaging them:
from collections import defaultdict
Averaged=[]
A=[[1,2,10],[1,2,10],[3,4,5]]
B=[[1,2,30],[6,7,9]]
C=A+B
print "C=",C
ToBeAveraged= defaultdict(int)
for (x,y,z) in C:
    ToBeAveraged[(x,y)] += z
Averaged = [k + (v,) for k, v in ToBeAveraged.iteritems()]
print 'Averaged=',Averaged
Is it possible to do this with defaultdict? Any ideas?
You'll need to sort the data first, because itertools.groupby only groups consecutive items that share a key:
>>> import itertools, operator
>>> C = sorted(A + B)
>>> def avg(x):
...     return sum(x) / len(x)
...
>>> [[avg(i) for i in zip(*y)] for x,y in
...  itertools.groupby(C, operator.itemgetter(0,1))]
[[1.0, 2.0, 16.666666666666668], [3.0, 4.0, 5.0], [6.0, 7.0, 9.0]]
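To see why the sort matters, here is a small sketch (written for Python 3; the helper name `group_keys` is only illustrative and not part of any library) that lists the keys groupby yields for the unsorted and the sorted data:

```python
import itertools
import operator

A = [[1, 2, 10], [1, 2, 10], [3, 4, 5]]
B = [[1, 2, 30], [6, 7, 9]]
key = operator.itemgetter(0, 1)  # groups rows by their (x, y) pair


def group_keys(rows):
    """Return the key of each group that groupby produces, in order."""
    return [k for k, _ in itertools.groupby(rows, key)]


# Without sorting, [1, 2, 30] is not adjacent to the other (1, 2) rows,
# so the (1, 2) key starts a second, separate group.
unsorted_keys = group_keys(A + B)

# After sorting, rows with equal (x, y) are adjacent and form one group.
sorted_keys = group_keys(sorted(A + B))
```

With the unsorted input the key (1, 2) appears twice, so its z values would be averaged in two separate pieces.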
If you just want the groups before the average:
>>> [list(y) for x,y in itertools.groupby(C, operator.itemgetter(0,1))]
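As for the defaultdict question: it should also work if you collect the z values in a list per (x, y) key and average afterwards, instead of summing as you go. A minimal sketch, assuming Python 3 (where `/` is true division; on Python 2 you would need to convert to float first); the function name `average_by_xy` is just a placeholder:

```python
from collections import defaultdict

A = [[1, 2, 10], [1, 2, 10], [3, 4, 5]]
B = [[1, 2, 30], [6, 7, 9]]


def average_by_xy(rows):
    """Average z over all rows that share the same (x, y) pair."""
    z_values = defaultdict(list)  # (x, y) -> list of z values seen so far
    for x, y, z in rows:
        z_values[(x, y)].append(z)
    # Build one (x, y, mean_z) tuple per distinct (x, y) key.
    return [(x, y, sum(zs) / len(zs)) for (x, y), zs in z_values.items()]


averaged = average_by_xy(A + B)
```

This version needs no sorting, since the dictionary gathers matching keys wherever they occur in the input.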