Tags: python, multiprocessing, python-multiprocessing, python-multithreading

ProcessPoolExecutor increases performance in a non-intuitive way


I have an application that is basically multithreaded: thread 1 does the computation and thread 2 runs the GUI (Tkinter). One part of the computation is a function with a loop, so I decided to use multiprocessing there, like this:

import os

import numpy as np
from concurrent.futures import ProcessPoolExecutor

def mpw1(idw_tree, mapsdata, inlines, xlines, x, y, dfattrs, calcplan, attrsdim, mdim):

    n_cores = os.cpu_count()
    flatcubec2 = np.zeros((attrsdim, mdim))

    with ProcessPoolExecutor(n_cores) as ex:
        args = ((i, calcplan, idw_tree, mapsdata, dfattrs, flatcubec2, inlines, xlines, n_cores) for i in range(n_cores))
        flatcubec2 = ex.map(circle, args)

    return flatcubec2

where circle is just a computational function (let's say it's counting something).
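The same pattern can be reduced to a self-contained sketch, with a hypothetical stand-in for circle (the real one is not shown in the question). Note that ex.map returns a lazy iterator, so the results have to be collected explicitly, e.g. into a list or an array:

```python
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def circle(args):
    """Hypothetical stand-in for the real worker: unpack the argument
    tuple and return one row of the result."""
    i, mdim = args
    return np.full(mdim, i, dtype=float)

def mpw1_sketch(n_cores=2, mdim=4):
    args = ((i, mdim) for i in range(n_cores))
    with ProcessPoolExecutor(n_cores) as ex:
        # ex.map yields results lazily, in submission order;
        # list() forces all workers to finish before the pool closes.
        rows = list(ex.map(circle, args))
    return np.vstack(rows)

if __name__ == "__main__":
    out = mpw1_sketch(n_cores=2, mdim=4)
    print(out.shape)  # (2, 4)
```

This keeps the one-argument-tuple-per-worker calling convention of the original, just with a trivial worker so the sketch runs on its own.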

What is strange is that setting n_cores as high as possible does not give the best performance. Here are some timings:

8 cores (max) - 17 sec
6 cores - 14 sec
4 cores - 12 sec
3 cores - 14 sec
2 cores - 17 sec

What is actually going on? Why does using all of the hardware not give the best performance? Is the problem in the way I am using multiprocessing?


Solution

  • This behavior is explained by the fact that I used the wrong call (multiprocessing.cpu_count()) to set the number of worker processes. That call returned twice as many CPUs as I should use, because it counts logical CPUs; for CPU-bound multiprocessing only the physical cores matter. So the behavior where performance starts to decrease after the 4th worker (the number of physical cores in my case) is explained by the fact that multiprocessing scales predictably only up to the physical core count. To get the number of physical cores only, I used:

    psutil.cpu_count(logical=False)
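For comparison, a small sketch of the two counts side by side. psutil is a third-party package, so the sketch falls back gracefully when it is not installed; the fallback division by two is only a rough assumption for 2-way hyper-threaded CPUs:

```python
import os

logical = os.cpu_count()  # logical CPUs, hyper-threads included

try:
    import psutil
    physical = psutil.cpu_count(logical=False)  # physical cores only
except ImportError:
    physical = None  # psutil not installed

print(f"logical={logical}, physical={physical}")

# Pool size: physical cores when known, otherwise assume 2-way SMT.
n_workers = physical or max(1, logical // 2)
```

On the asker's machine this would give logical=8 and physical=4, matching the sweet spot in the timings above.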