Python: Finding number of same lists occurrences and averaging

I will explain my issue using an example:


From these lists of lists, I would like to create a third one:


So I get:

C= [[1, 2, 10], [1, 2, 10], [3, 4, 5], [1, 2, 30], [6, 7, 9]]

Notice that three of the lists inside C, namely [1, 2, 10], [1, 2, 10] and [1, 2, 30], have the same x, y when read as [x, y, z], but not all the same z.

So I would like to have this new list:

Averaged= [(1, 2, 16.666), (6, 7, 9), (3, 4, 5)]

where each x, y pair occurs only once, so the lists

[1, 2, 10], [1, 2, 10], [1, 2, 30]

collapse into a single entry whose z is the average of the corresponding z values: (10+10+30)/3 = 16.666

I tried using for loops at first, but then switched to trying this with defaultdict.

This is what I have so far. It keeps each (x, y) only once, but it sums the corresponding z values instead of averaging them:

from collections import defaultdict

print "C=",C

ToBeAveraged= defaultdict(int)
for (x,y,z) in C:
    ToBeAveraged[(x,y)] += z
Averaged = [k + (v,) for k, v in ToBeAveraged.iteritems()]    

print 'Averaged=',Averaged

Is it possible to do this with defaultdict? Any ideas?
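
A minimal sketch of how the defaultdict idea could average instead of sum, assuming C is the list shown above: collect the z values for each (x, y) key in a list, then divide their sum by their count. The names `groups` and `averaged` are illustrative and not from the original post.

```python
from collections import defaultdict

# C as given in the question
C = [[1, 2, 10], [1, 2, 10], [3, 4, 5], [1, 2, 30], [6, 7, 9]]

# Collect every z value under its (x, y) key instead of summing them
groups = defaultdict(list)
for x, y, z in C:
    groups[(x, y)].append(z)

# One tuple per distinct (x, y), with the mean of its z values;
# float() keeps the division exact on Python 2 as well
averaged = [key + (sum(zs) / float(len(zs)),) for key, zs in groups.items()]
```

No sorting is needed with this approach, since the dictionary groups by key regardless of position. On older Python versions the order of the resulting entries is not guaranteed.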


  • You'll need to sort the data first, because itertools.groupby only groups consecutive items that share a key:

    >>> import itertools, operator
    >>> C = sorted(A + B)
    >>> def avg(x):
    ...     return sum(x) / float(len(x))  # float() avoids integer division on Python 2
    ...
    >>> [[avg(i) for i in zip(*y)] for x, y in
    ...  itertools.groupby(C, operator.itemgetter(0, 1))]
    [[1.0, 2.0, 16.666666666666668], [3.0, 4.0, 5.0], [6.0, 7.0, 9.0]]

    If you just want the groups before the average:

    [list(y) for x,y in itertools.groupby(C, operator.itemgetter(0,1))]
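
To illustrate why the sort matters, here is a small sketch that counts how many rows land in each group with and without sorting, using the C from the question. The helper name `group_sizes` is made up for this example. Since `groupby` only merges adjacent items with equal keys, the (1, 2) rows end up split across two groups on the unsorted list.

```python
import itertools
import operator

# C as given in the question (unsorted)
C = [[1, 2, 10], [1, 2, 10], [3, 4, 5], [1, 2, 30], [6, 7, 9]]
key = operator.itemgetter(0, 1)  # group on the (x, y) pair

def group_sizes(rows):
    # Return (key, number of rows) for each run of adjacent rows with equal keys
    return [(k, len(list(g))) for k, g in itertools.groupby(rows, key)]

unsorted_groups = group_sizes(C)
# → [((1, 2), 2), ((3, 4), 1), ((1, 2), 1), ((6, 7), 1)]

sorted_groups = group_sizes(sorted(C))
# → [((1, 2), 3), ((3, 4), 1), ((6, 7), 1)]
```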