I am new to multiprocessing and I need your help.
I have four variables, each of which can take up to 4 values (integers or floats), and I stored all of them in a list called par=[A, B, C, D] (see below).
I have created a list of all possible combinations with par = itertools.product(*par).
Then I call a function func1, which takes these arguments plus a few more and calculates some values. With the results of func1, I call another function, func2, which does further calculations and then writes to a file.
I want to run all of this in parallel with multiprocessing.Pool. My idea was to wrap func1 and func2 in another function, called func_run, and map it over the list par I created above.
To summarize, my code looks like:
import itertools
import numpy as np
import pandas as pd
#values that I will use for func1
r = np.logspace(np.log10(5),np.log10(300),300)
T = 200*r
#Parameters for the sim
A = [0.1, 0.05, 0.001, 0.005]
B = [0.005, 0.025, 0.05, 0.1]
C = [20, 60, 100, 200]
D = [10, 20, 40, 80]
#Store them in a list
par = [A, B, C, D]
#Create a list with all combinations
par = list(itertools.product(*par))
def func_run(param):
    for i in range(len(param)):
        # Call func1
        values = func1(param[i][0],param[i][1],param[i][2], param[i][3], r, T)
        x = values[0]
        y = values[1]
        # and so on
        # Call func2
        results = func2(x,y,...)
        z = results[0]
        w = results[1]
        # and so on
        data_dict = {'result 1': [param[i][0]], 'result 2' : [param[i][1]]}
        df = pd.DataFrame(data=data_dict)
        with open(filename, 'a') as f:
            df.to_csv(f, header=False)
    return
Then, I call func_run with multiprocessing:
from multiprocessing import Pool
pool = Pool(processes=4)
results = pool.map(func_run, par)
As a result, I get a TypeError with the following traceback:
---------------------------------------------------------------------------
RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "/home/user/anaconda3/lib/python3.6/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/home/user/anaconda3/lib/python3.6/multiprocessing/pool.py", line 44, in mapstar
return list(map(*args))
File "<ipython-input-14-5ce94acfd95e>", line 5, in run
values = calc_val(param[i][0],param[i][1],param[i][2], param[i][3], r, T)
TypeError: 'float' object is not subscriptable
"""
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
<ipython-input-15-f45146f68f66> in <module>()
1 pool = Pool(processes=4)
----> 2 test = pool.map(run,par)
~/anaconda3/lib/python3.6/multiprocessing/pool.py in map(self, func, iterable, chunksize)
264 in a list that is returned.
265 '''
--> 266 return self._map_async(func, iterable, mapstar, chunksize).get()
267
268 def starmap(self, func, iterable, chunksize=None):
~/anaconda3/lib/python3.6/multiprocessing/pool.py in get(self, timeout)
642 return self._value
643 else:
--> 644 raise self._value
645
646 def _set(self, i, obj):
TypeError: 'float' object is not subscriptable
Unfortunately, I cannot post the full functions and what they do, because they are hundreds of lines long. I hope you can get the idea even though you cannot reproduce it yourselves.
Is it possible to run something like this with multiprocessing, or do I need a different approach? It would be great if someone could help me understand the error and get this running.
The result of par = list(itertools.product(*par)) is a list of tuples of floats (and ints). Pool.map() takes an iterable as its 2nd argument and maps over its items, passing them individually to the given func. In other words, inside func_run(param), param is not a list of tuples of numbers but a single tuple of numbers, so param[i][0] tries to access the 0th item of the ith number (a float or an int), which makes no sense, hence the exception. You should probably remove the for-loop in func_run():
def func_run(param):
    values = func1(param[0], param[1], param[2], param[3], r, T)
    ...
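For a version that can be run end to end, here is a minimal sketch of the same fix. It rests on a few assumptions: func1 is a hypothetical stand-in (the real one is not shown in the question), the extra arguments r and T are left out, and the func2 and file-writing steps are omitted, so the worker just returns its values. The point is only that func_run receives one (a, b, c, d) tuple per call.

```python
import itertools
from multiprocessing import Pool

# Parameter grids copied from the question.
A = [0.1, 0.05, 0.001, 0.005]
B = [0.005, 0.025, 0.05, 0.1]
C = [20, 60, 100, 200]
D = [10, 20, 40, 80]

# 4 * 4 * 4 * 4 = 256 tuples of the form (a, b, c, d).
par = list(itertools.product(A, B, C, D))


def func1(a, b, c, d):
    """Hypothetical stand-in for the real func1, which the question does not show."""
    return a * c, b * d


def func_run(param):
    """Handle ONE combination; Pool.map calls this once per tuple in par."""
    a, b, c, d = param  # param is a single (a, b, c, d) tuple, not the whole list
    x, y = func1(a, b, c, d)
    return param, x, y


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block on import.
    with Pool(processes=4) as pool:
        results = pool.map(func_run, par)
    print(len(results))  # one result per combination
    print(results[0])
```

Unpacking the tuple at the top of func_run is equivalent to indexing param[0] through param[3]; it just makes it more obvious that each call sees a single combination.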