I am trying to modify the code found here to use multiprocessing: https://github.com/Sensory-Information-Processing-Lab/infotuple/blob/master/body_metrics.py
In the primal_body_selector function, I want to run lines 146-150 in parallel:
for i in range(len(tuples)):
    a = tuples[i][0]
    B = tuples[i][1:]
    infogains[i] = mutual_information(M, a, B, M.shape[0]/10, dist_std, mu)
I believe this could lead to significant performance gains because the mutual_information
function (code here) is mainly just matrix math so multiprocessing should really help.
However, when I try to use a simple pool = ThreadPool(processes=8) at the top of the file (called from a separate main() method, so pool is initialized on import) and then run the command below in place of the loop code listed above:
def infogains_task_function(i, infogains, M, tuples, dist_std, mu):
    a = tuples[i][0]
    B = tuples[i][1:]
    infogains[i] = mutual_information(M, a, B, M.shape[0], dist_std, mu)

................

# inside primal_body_selector
pool.starmap(infogains_task_function,
             [(i, infogains, M, tuples, dist_std, mu) for i in range(len(tuples))],
             chunksize=80)
This code chunk is twice as slow as the original loop (4 seconds instead of 2, as measured by time.time()). Why is that? Regardless of which chunksize I pick (I tried 1, 20, 40, 80), it's twice as slow.
I originally thought serializing M and tuples could be the reason, but M is a 32x32 matrix and tuples is a list of 179 tuples of length 3 each, so it's really not that much data, right?
Any help would be greatly appreciated.
Neither multiprocessing nor multithreading is a magic silver bullet... You are right that multiprocessing is a nice tool for heavy computations on multi-processor systems (or multi-core processors, which is functionally the same).
The problem is that spreading work across a number of threads or processes adds complexity: you have to share or copy some memory, gather the results at some point, and synchronize everything. So for small, simple tasks the overhead is higher than the gain.
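As a rough, self-contained illustration (this is not the code from your question), timing many tiny numpy tasks run serially and then through a ThreadPool usually shows the pool breaking even at best, because every tiny task still pays for dispatch and synchronization:

import time
import numpy as np
from multiprocessing.pool import ThreadPool

def tiny_task(i, M):
    # A deliberately small piece of matrix math standing in for one
    # per-item computation (sizes chosen to mimic a 32x32 matrix).
    return float(np.linalg.norm(M @ M[:, i % M.shape[0]]))

M = np.random.rand(32, 32)
args = [(i, M) for i in range(179)]

t0 = time.time()
serial = [tiny_task(i, m) for i, m in args]
print("serial:", time.time() - t0)

pool = ThreadPool(processes=8)
t0 = time.time()
threaded = pool.starmap(tiny_task, args, chunksize=20)
print("ThreadPool:", time.time() - t0)
pool.close()
pool.join()

On most machines the second timing is no better than the first: the work per task is far too small to amortize the cost of handing it to a worker.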
To make things worse: you may reduce the overhead by carefully splitting your tasks manually, but when you use a generic tool (even a nicely crafted one like the Python standard library), be aware that its creators had to take care of many use cases and include a number of checks in their code, again with added complexity. And the manual way dramatically increases the development (and testing) cost...
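If you still want to keep the pool, one way to cut the per-task overhead is to give each worker one large contiguous block of indices instead of one tuple at a time. The sketch below reuses the names from your snippet (pool, tuples, infogains, M, dist_std, mu, mutual_information), so it is only an assumption about how it would slot into primal_body_selector:

def infogains_block_task(start, stop, infogains, M, tuples, dist_std, mu):
    # Each worker handles a whole block of indices, so the pool only
    # schedules a handful of tasks instead of one per tuple.
    for i in range(start, stop):
        a = tuples[i][0]
        B = tuples[i][1:]
        infogains[i] = mutual_information(M, a, B, M.shape[0] / 10, dist_std, mu)

# inside primal_body_selector
n = len(tuples)
n_workers = 8
bounds = [(k * n // n_workers, (k + 1) * n // n_workers) for k in range(n_workers)]
pool.starmap(infogains_block_task,
             [(start, stop, infogains, M, tuples, dist_std, mu)
              for start, stop in bounds])

Note that writing into a shared infogains list only works because a ThreadPool shares memory; with a process pool each worker would fill its own copy, and you would have to collect the return values instead.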
What you should remember from that: use simple tools for simple tasks, and only go with multi-threading or multi-processing when it is really required, for example for long, heavy computations that can run independently on several cores.
BTW, for simple computation tasks like matrix operations, numpy/scipy are probably far better suited than raw Python processing...
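For instance, the sketch below (a generic example, unrelated to the internals of mutual_information, which are not shown here) compares a Python-level double loop with a single vectorized numpy expression for pairwise squared distances; this kind of vectorization is usually worth far more than any pool:

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((179, 32))

def pairwise_sq_dists_loop(X):
    # One small numpy call per pair: lots of Python-level overhead.
    n = X.shape[0]
    D = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            D[i, j] = np.sum((X[i] - X[j]) ** 2)
    return D

def pairwise_sq_dists_vec(X):
    # One big expression: all loops happen inside numpy's compiled code.
    sq = np.sum(X ** 2, axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * X @ X.T

assert np.allclose(pairwise_sq_dists_loop(X), pairwise_sq_dists_vec(X))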