I'm just getting my feet wet with multiprocessing (and it's totally awesome!), but I was wondering if there were any guidelines for selecting the number of processes. Is it just based on the number of cores on the server? Is it somehow based on the application you're running (number of loops, how much CPU it uses, etc.)? How do I decide how many processes to spawn? Right now I'm just guessing and adding/removing processes, but it would be great if there were some kind of guideline or best practice.
Another question: I know what happens if I add too few (the program is slow), but what if I add 'too many'?
If all of your threads/processes are indeed CPU-bound, you should run as many processes as the CPU reports cores. Due to HyperThreading, each physical CPU core may present multiple virtual cores. Call multiprocessing.cpu_count() to get the number of virtual cores.
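As a minimal sketch of that rule, the following sizes a pool to the number of virtual cores. The function cpu_bound_task and the sample inputs are made up for illustration; only multiprocessing.cpu_count and multiprocessing.Pool come from the standard library.

```python
import multiprocessing


def cpu_bound_task(n):
    """Hypothetical CPU-bound work: sum of the squares of 0..n-1."""
    return sum(i * i for i in range(n))


def run_one_process_per_core(inputs):
    # cpu_count() reports virtual (logical) cores, so HyperThreading
    # siblings are already included in this number.
    n_procs = multiprocessing.cpu_count()
    with multiprocessing.Pool(processes=n_procs) as pool:
        return pool.map(cpu_bound_task, inputs)


if __name__ == "__main__":
    # The guard is needed on platforms that spawn rather than fork workers.
    print(run_one_process_per_core([10, 100, 1000]))
```

Note that Pool() without a processes argument already defaults to the machine's CPU count, so passing it explicitly mainly documents the intent.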
If only a fraction p of your threads is CPU-bound, you can adjust that number by multiplying by p. For example, if half your processes are CPU-bound (p = 0.5) and you have two CPUs with 4 cores each and 2x HyperThreading, you should start 0.5 * 2 * 4 * 2 = 8 processes.
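The arithmetic above can be wrapped in a small helper. This is only a sketch of the rule of thumb: the name suggested_process_count, the rounding, and the floor of one process are my own assumptions, not part of any library.

```python
import multiprocessing


def suggested_process_count(cpu_bound_fraction, virtual_cores=None):
    """Scale the virtual core count by the CPU-bound fraction p.

    Hypothetical helper: rounding to the nearest integer and never
    returning fewer than 1 process are illustrative choices.
    """
    if virtual_cores is None:
        virtual_cores = multiprocessing.cpu_count()
    if not 0 < cpu_bound_fraction <= 1:
        raise ValueError("cpu_bound_fraction must be in (0, 1]")
    return max(1, round(cpu_bound_fraction * virtual_cores))


if __name__ == "__main__":
    # Two CPUs, 4 cores each, 2x HyperThreading = 16 virtual cores.
    print(suggested_process_count(0.5, virtual_cores=2 * 4 * 2))  # 8
```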
If you have too few processes, your application will run slower than expected. If your application scales perfectly and is only CPU-bound (i.e. it is 10 times faster when executed on 10 times the number of cores), the slowdown is proportional to the shortfall. For example, if your system calls for 8 processes but you only start 4, you'll use only half of the processing capacity and take twice as long. Note that in practice, no application scales perfectly, but some (ray tracing, video encoding) are pretty close.
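To make the "half the processes, twice the time" reasoning concrete, here is an idealized model. It assumes perfect scaling and zero overhead, which no real program achieves; ideal_runtime and the 80-second figure are invented for the example.

```python
def ideal_runtime(serial_seconds, processes, virtual_cores):
    """Runtime under perfect scaling (an idealization, not a measurement).

    Processes beyond the number of virtual cores cannot run at the same
    time, so they add no speedup in this model.
    """
    if processes < 1 or virtual_cores < 1:
        raise ValueError("processes and virtual_cores must be >= 1")
    usable = min(processes, virtual_cores)
    return serial_seconds / usable


if __name__ == "__main__":
    # Hypothetical job needing 80 s on one core, on a machine with 8 virtual cores.
    for n in (4, 8, 16):
        print(n, "processes:", ideal_runtime(80, n, 8), "s")
```

With these made-up numbers, 4 processes take 20 s and 8 take 10 s; going to 16 gains nothing in the model, and in a real program would only add overhead.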
If you have too many processes, the synchronization overhead will increase. If your program has little to no synchronization overhead, this won't impact the overall runtime, but it may make other programs appear slower than they are unless you set your processes to a lower priority. Excessive numbers of processes (say, 10000) are fine in theory if your OS has a good scheduler. In practice, with that many processes, virtually any synchronization will make the overhead unbearable.
If you're not sure whether your application is CPU-bound and/or perfectly scaling, simply observe system load with different process counts. You want the CPU utilization to be slightly under 100%, or, more precisely, the load average reported by uptime to be about the number of virtual cores.
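The same check can be done from Python while your workers are running. This is a rough sketch: os.getloadavg is only available on Unix-like systems, the helper load_per_core is made up for illustration, and the load average also counts every other program on the machine.

```python
import multiprocessing
import os


def load_per_core(load_average, virtual_cores):
    """Load average divided by virtual cores (hypothetical helper).

    A value near 1.0 suggests the cores are about fully used; well below
    means idle capacity, well above means processes are queuing for CPU.
    """
    return load_average / virtual_cores


if __name__ == "__main__":
    # 1-minute load average, the first number that uptime prints.
    one_minute, _, _ = os.getloadavg()  # Unix only
    cores = multiprocessing.cpu_count()
    print("load per virtual core:", round(load_per_core(one_minute, cores), 2))
```

Run it a few times with different process counts and keep the count that brings the value close to 1.0 without going much above it.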