I have a function that I apply sequentially to a list of objects; it returns a score for each object, as follows:
import numpy as np

def get_score(a):
    # do something that computes `score` from `a`
    return score

objects = [obj0, obj1, obj3]
results = np.zeros(len(objects))
for i in range(len(results)):
    results[i] = get_score(objects[i])
I want to parallelize the execution of this function with the multiprocessing library, but I have a question: how can I tell which score corresponds to which object, since I will not have a shared results list?
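One possible solution is to return the index together with the processed result (the score) from the get_score function, so that each result can be written back to the position of the object it belongs to.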
Example:
from multiprocessing import Pool


def get_score(tpl):
    i, (par1, par2) = tpl
    # do something
    return i, f"{par1=} {par2=} processed"


if __name__ == "__main__":
    par1 = ["obj1", "obj2", "obj3"]
    par2 = ["par2_1", "par2_2", "par2_3"]
    # ...

    results = [None] * len(par1)

    with Pool() as p:
        # process the objects in unordered fashion
        for i, result in p.imap_unordered(get_score, enumerate(zip(par1, par2))):
            results[i] = result

    print(results)
Prints (the list is wrapped here for readability; print() puts it on one line):
[
    "par1='obj1' par2='par2_1' processed",
    "par1='obj2' par2='par2_2' processed",
    "par1='obj3' par2='par2_3' processed",
]
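If the scores are only needed once every object has been processed, the index bookkeeping may not be necessary at all: Pool.map returns its results in the same order as the input iterable, so results[i] belongs to objects[i]. A minimal sketch of that variant, where the body of get_score, the helper score_all, the processes=2 default and the sample objects are placeholders of mine rather than anything from the question:

```python
from multiprocessing import Pool


def get_score(obj):
    # Hypothetical stand-in for the real scoring logic:
    # here the score is simply the length of the object.
    return len(obj)


def score_all(objects, processes=2):
    # Pool.map blocks until all tasks have finished and returns the
    # results in input order, regardless of which worker finished first.
    with Pool(processes) as pool:
        return pool.map(get_score, objects)


if __name__ == "__main__":
    objects = ["a", "bb", "ccc"]  # placeholder objects
    results = score_all(objects)
    # Pair each object with its score by position.
    for obj, score in zip(objects, results):
        print(obj, score)
```

The returned list can be passed to np.array(...) if a NumPy array is wanted, as in the question. The imap_unordered version above is still the better fit when results should be handled as soon as each one is ready.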