Rapid compression of multiple lists with value addition


I am looking for a Pythonic way to process a large number of lists: for each repeated value in one list, I want to sum the values at the same indices in another list.

For example, say I have two lists

a = [ 1, 2, 3, 1, 2, 3, 1, 2, 3]
b = [ 1, 2, 3, 4, 5, 6, 7, 8, 9]

What I want to do is find the unique values in a, and then add together the corresponding values from b with the same index. My attempt, which is quite slow, is as follows:

a1 = list(set(a))
b1 = [0 for y in range(len(a1))]
for m in range(len(a)):
    for k in range(len(a1)):
        if a1[k] == a[m]:
            b1[k] += b[m]

and I get

a1=[1, 2, 3]
b1=[12, 15, 18]

Please let me know if there is a faster, more pythonic way to do this. Thanks


Solution

  • Use the zip() function and a defaultdict to sum the values per unique key:

    from collections import defaultdict
    try:
        # Python 2 compatibility
        from future_builtins import zip
    except ImportError:
        # Python 3, already there
        pass
    
    values = defaultdict(int)
    for key, value in zip(a, b):
        values[key] += value
    
    a1, b1 = zip(*sorted(values.items()))
    

    zip() pairs up the values from your two input lists; all you then have to do is sum the values from b for each unique value of a.

    The last line pulls the (key, total) pairs out of the resulting dictionary, sorts them by key, and puts just the keys and just the totals into a1 and b1, respectively (both as tuples).

    Demo:

    >>> from collections import defaultdict
    >>> a = [ 1, 2, 3, 1, 2, 3, 1, 2, 3]
    >>> b = [ 1, 2, 3, 4, 5, 6, 7, 8, 9]
    >>> values = defaultdict(int)
    >>> for key, value in zip(a, b):
    ...     values[key] += value
    ...
    >>> list(zip(*sorted(values.items())))
    [(1, 2, 3), (12, 15, 18)]
    

    If you don't care about output order, you can drop the sorted() call altogether.
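
As a rough sketch of how the pieces above fit together, the loop can be wrapped in a helper so it can be reused across many pairs of lists. The name sum_by_key and its sort flag are made up for this illustration and are not part of the original answer. With sort=False the keys come out in first-seen order, which assumes dicts preserve insertion order (Python 3.7+).

```python
from collections import defaultdict


def sum_by_key(keys, values, sort=True):
    """Sum `values` grouped by the matching entry in `keys`.

    Hypothetical helper for illustration; returns two tuples:
    (unique keys, per-key totals).
    """
    totals = defaultdict(int)
    for key, value in zip(keys, values):
        totals[key] += value
    items = sorted(totals.items()) if sort else list(totals.items())
    if not items:
        # zip(*[]) yields nothing to unpack, so handle empty input explicitly
        return (), ()
    unique_keys, sums = zip(*items)
    return unique_keys, sums


a = [1, 2, 3, 1, 2, 3, 1, 2, 3]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9]
a1, b1 = sum_by_key(a, b)
print(a1, b1)  # (1, 2, 3) (12, 15, 18)

# Unsorted: keys appear in the order they were first seen
a2, b2 = sum_by_key([3, 1, 3, 1], [10, 20, 30, 40], sort=False)
print(a2, b2)  # (3, 1) (40, 60)
```

This makes a single pass over the inputs, so apart from the final sort over the unique keys the cost grows roughly linearly with the list length, whereas the nested loops in the question scan every unique value for every element.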