Tags: python, list, dictionary, weighted-average

How to use dictionary values to calculate sum of weighted values without explicitly defined weights?


As an example I have a dictionary as follows:

mydict = {'A':['asdasd',2,3], 'B':['asdasd',4,5], 'C':['rgerg',9,10]}

How can I get a single number that is the sum, over all the keys in the dict (in this example: A, B, C), of the last number in each list (list[-1]) weighted by that list's middle number (list[1]) taken as a fraction of the sum of all the middle numbers?

So for instance, this would be the sum of:

(2/15)*3 + (4/15)*5 + (9/15)*10 ≈ 7.73

Where 15 is the sum of 2,4,9.
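
For reference, the arithmetic above can be spelled out as a plain two-step loop. This is only a baseline sketch, assuming Python 3 (true division) and string keys; the names total and weighted_sum are illustrative and not from the original question.

```python
# Same sample data as above, with the keys written as strings.
mydict = {'A': ['asdasd', 2, 3], 'B': ['asdasd', 4, 5], 'C': ['rgerg', 9, 10]}

# Step 1: the normalising total is the sum of the middle numbers (2 + 4 + 9).
total = sum(entry[1] for entry in mydict.values())

# Step 2: add up (middle / total) * last for every key.
weighted_sum = 0.0
for entry in mydict.values():
    weighted_sum += (entry[1] / total) * entry[-1]

print(total)                   # 15
print(round(weighted_sum, 2))  # 7.73
```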

At the moment I'm creating too many variables that just iterate through the dictionary. I'm sure there's a quick efficient Python way to do it.


Solution

  • Since (2/15)*3 + (4/15)*5 + (9/15)*10 is the same as (2*3 + 4*5 + 9*10)/15, you don't have to pre-calculate the total. You can do this in a single pass over the dict with reduce, though it is perhaps less readable.

    Turning the dict values into a series of (a*b, a) tuples:

    >>> d = {'A':['asdasd',2,3], 'B':['asdasd',4,5], 'C':['rgerg',9,10]}
    >>> from functools import reduce    # Py3 (reduce is a builtin in Py2)
    >>> x,y = reduce(lambda t, e: (t[0]+e[0], t[1]+e[1]), ((a*b, a) for _,a,b in d.values()))
    >>> x/y   # float(y) in Py2
    7.733333333333333
    

    Or without the intermediate generator and using an initial value (probably the fastest):

    >>> x,y = reduce(lambda t, e: (t[0]+e[1]*e[2], t[1]+e[1]), d.values(), (0,0))
    >>> x/y   # float(y) in Py2
    7.733333333333333
    

    Or you could zip up the results and sum:

    >>> x,y = (sum(x) for x in zip(*((a*b, a) for _,a,b in d.values())))
    >>> x/y   # float(y) in Py2
    7.733333333333333
    

    Or you could do what @ranlot suggests, though I would avoid the intermediate lists:

    >>> sum(a*b for _,a,b in d.values())/sum(a for _,a,b in d.values())
    7.733333333333333
    

    This makes two passes over the dict, though, and seems to be fractionally slower than either of the approaches above.
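
If the reduce one-liners feel too dense, the same single-pass idea can be wrapped in a small function. The following is a minimal sketch, assuming Python 3 and that every value is a three-item list of the form [label, weight, value]. The name weighted_average and the zero-weight check are illustrative additions, not part of the original answer. The Fraction lines at the end simply confirm, with exact arithmetic, the identity the answer relies on.

```python
from fractions import Fraction


def weighted_average(d):
    """Average of each entry's last number, weighted by its middle number."""
    weighted_total = 0  # running sum of weight * value
    weight_total = 0    # running sum of the weights
    for _label, weight, value in d.values():
        weighted_total += weight * value
        weight_total += weight
    if weight_total == 0:
        # Assumption: a zero total weight is treated as an error.
        raise ValueError("weights sum to zero")
    return weighted_total / weight_total


d = {'A': ['asdasd', 2, 3], 'B': ['asdasd', 4, 5], 'C': ['rgerg', 9, 10]}
result = weighted_average(d)
print(result)

# Exact check that normalising each weight first gives the same answer
# as dividing the summed products by the total once.
total = sum(w for _label, w, _v in d.values())
normalised_first = sum(Fraction(w, total) * v for _label, w, v in d.values())
divided_once = Fraction(sum(w * v for _label, w, v in d.values()), total)
print(normalised_first == divided_once)  # True
```

Keeping the two running totals in named variables does the same work as the (x, y) tuple threaded through reduce, at the cost of a few more lines.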