I am looking for a pythonic way to iterate through a large number of lists and use the index of repeated values from one list to calculate a total value from the values with the same index in another list.
For example, say I have two lists:
a = [ 1, 2, 3, 1, 2, 3, 1, 2, 3]
b = [ 1, 2, 3, 4, 5, 6, 7, 8, 9]
What I want to do is find the unique values in a, and then add together the corresponding values from b with the same index. My attempt, which is quite slow, is as follows:
a1=list(set(a))
b1=[0 for y in range(len(a1))]
for m in range(len(a)):
    for k in range(len(a1)):
        if a1[k]==a[m]:
            b1[k]+=b[m]
and I get
a1=[1, 2, 3]
b1=[12, 15, 18]
Please let me know if there is a faster, more pythonic way to do this. Thanks
Use the zip() function and a defaultdict dictionary to collect values per unique value:
from collections import defaultdict
try:
    # Python 2 compatibility
    from future_builtins import zip
except ImportError:
    # Python 3, already there
    pass
values = defaultdict(int)
for key, value in zip(a, b):
    values[key] += value
a1, b1 = zip(*sorted(values.items()))
zip() pairs up the values from your two input lists; all you then have to do is sum up the values from b for each unique value of a.
The last line pulls out the keys and values from the resulting dictionary, sorts these, and puts just the keys and just the values into a1 and b1, respectively.
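The same steps can be wrapped in a small reusable function. This is only a sketch: the name sum_by_key and its parameter names are made up for illustration, and it returns two lists rather than the tuples that zip(*...) produces. Building the lists directly also means empty input does not raise an error when unpacking.

```python
from collections import defaultdict

def sum_by_key(keys, values):
    """Sum the entries of `values` that share the same entry in `keys`.

    Hypothetical helper, not part of any library.
    """
    totals = defaultdict(int)
    for key, value in zip(keys, values):
        totals[key] += value
    # Sort by key so the output order is deterministic.
    pairs = sorted(totals.items())
    unique_keys = [key for key, _ in pairs]
    sums = [total for _, total in pairs]
    return unique_keys, sums

a = [1, 2, 3, 1, 2, 3, 1, 2, 3]
b = [1, 2, 3, 4, 5, 6, 7, 8, 9]
a1, b1 = sum_by_key(a, b)
print(a1, b1)  # [1, 2, 3] [12, 15, 18]
```

Each element is visited once, so the work grows with len(a) plus the cost of sorting the unique keys, instead of with len(a) * len(a1) as in the nested loops.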
Demo:
>>> from collections import defaultdict
>>> a = [ 1, 2, 3, 1, 2, 3, 1, 2, 3]
>>> b = [ 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> values = defaultdict(int)
>>> for key, value in zip(a, b):
...     values[key] += value
...
>>> zip(*sorted(values.items()))
[(1, 2, 3), (12, 15, 18)]
If you don't care about output order, you can drop the sorted() call altogether.
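As a rough illustration of the unsorted variant, assuming Python 3.7 or later (where dictionaries keep insertion order), the keys come back in the order they first appear in a. The input lists here are made-up sample data, chosen so that the difference from sorted order is visible.

```python
from collections import defaultdict

# Sample data (illustrative only), with keys deliberately out of order.
a = [3, 1, 3, 2, 1]
b = [10, 20, 30, 40, 50]

totals = defaultdict(int)
for key, value in zip(a, b):
    totals[key] += value

# No sorted() call: on Python 3.7+ the dict preserves first-seen key order.
a1 = list(totals)
b1 = list(totals.values())
print(a1, b1)  # [3, 1, 2] [40, 70, 40]
```

On older Python versions the order of the keys is arbitrary, so keep sorted() there if the order matters.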