I'm diving into the multiprocessing world in Python.
After watching some videos, I have a question that comes from the signature of my function.
This function takes 4 arguments:
# process_data(file, signals_dict, parameter_dict, debug_mode=False)
file_list = [...]
t1 = time.time()
with concurrent.futures.ProcessPoolExecutor() as executor:
    executor.map(process_data, file_list)
t2 = time.time()
The question is: How can I specify the remaining parameters to the function?
Thanks in advance
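The ProcessPoolExecutor.map documentation is thin. Like the built-in map, it calls the worker with one item from each iterable it is given, so with a single iterable the worker receives a single parameter. If your target has a different call signature, one option is to write an intermediate worker that is passed a container and knows how to expand it into the parameter list. The documentation also does not make it obvious that map returns a lazy iterator of results. Exiting the pool's with clause shuts the pool down after waiting for pending jobs, but an exception raised in a worker only surfaces when you iterate over the results, so consume them rather than discarding the return value.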
import concurrent.futures
import os


def process_data(a, b, c, d):
    print(os.getpid(), a, b, c, d)
    return a


def _process_data_worker(p):
    return process_data(*p)


if __name__ == "__main__":
    file_list = [["fooa", "foob", "fooc", "food"],
                 ["bara", "barb", "barc", "bard"]]
    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = executor.map(_process_data_worker, file_list)
        for result in results:
            print('result', result)
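If the extra arguments are the same for every file, as in the question's process_data(file, signals_dict, parameter_dict, debug_mode=False), two other options may be simpler than a wrapper: freeze them with functools.partial, or pass one iterable per parameter and use itertools.repeat for the constant ones. The sketch below is only an illustration under assumptions: the body of process_data is a stand-in (the real one is not shown), and make_worker, run_with_partial, run_with_repeat and the file names are hypothetical names invented for the example.

```python
import concurrent.futures
from functools import partial
from itertools import repeat


def process_data(file, signals_dict, parameter_dict, debug_mode=False):
    # Hypothetical stand-in body; the real function is not shown in the question.
    return file, sorted(signals_dict), sorted(parameter_dict), debug_mode


def make_worker(signals_dict, parameter_dict, debug_mode=False):
    # Freeze the arguments that are identical for every file. A partial of a
    # module-level function can be pickled as long as its arguments can.
    return partial(process_data, signals_dict=signals_dict,
                   parameter_dict=parameter_dict, debug_mode=debug_mode)


def run_with_partial(file_list, signals_dict, parameter_dict, debug_mode=False):
    worker = make_worker(signals_dict, parameter_dict, debug_mode)
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # list() consumes the result iterator, so worker exceptions surface here.
        return list(executor.map(worker, file_list))


def run_with_repeat(file_list, signals_dict, parameter_dict, debug_mode=False):
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # map() stops at the shortest iterable, so file_list bounds the
        # endless repeat() iterators.
        return list(executor.map(process_data, file_list,
                                 repeat(signals_dict),
                                 repeat(parameter_dict),
                                 repeat(debug_mode)))


if __name__ == "__main__":
    files = ["a.csv", "b.csv"]  # placeholder file names
    signals = {"s1": 1}         # placeholder data
    params = {"p1": 2}          # placeholder data
    print(run_with_partial(files, signals, params))
    print(run_with_repeat(files, signals, params, debug_mode=True))
```

Both variants send the dictionaries to the worker processes by pickling, so for very large dictionaries it may be worth measuring that overhead before choosing between them.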