Search code examples
bashcentosparallel-processing

Run a looped process in bash across multiple cores


I have a shell script that contains the following loop.

i=0  
upperlimit=$verylargevariable  
do  
   complexstuff RunManager file $i  
   i= 'expr $i +1'  
done

This script runs on a quad core machine, and according to top, uses about 15% of each core while executing one iteration of the loop. I'd like to distribute it across the four cores so that each iteration of the loop does complexstuff four times, one on each core, so the resources will be used more efficiently. We're talking about computation that currently takes several hours so efficiency is more than just good practice here. (The output of each iteration is obviously independent of the previous one.)

PS: Host is a server running Cent-OS, if that helps.


Solution

  • Apart the Ole Tange solution (that looks great), if your computations have pretty similar durations, you can try something like this :

    i=0  
    upperlimit=$verylargevariable  
    do  
       complexstuff RunManager file $i &
       i= 'expr $i + 1'
       complexstuff RunManager file $i &
       i= 'expr $i + 1'
       complexstuff RunManager file $i &
       i= 'expr $i + 1'
       complexstuff RunManager file $i &
       i= 'expr $i + 1'
       wait
    done
    

    This way, on each run of the loop, you will create 4 bash subprocesses that will launch your computations (and as system is great, it will dispatch them on the different cores). If with 4 processes it is not enough to burn all your cpus, raise the number of processes created on each loop.