
How to run multiple instances of the same Python script which uses subprocess.call


I have a Python script job.py which accepts command-line arguments. The script uses the Python package subprocess to run some external programs. Both the script and the external programs are sequential (i.e. no MPI, OpenMP, etc.). I want to run this script 4 times, each time with different command-line arguments. My processor has 4 cores, so I would like to run all 4 instances simultaneously. If I open 4 terminals and run each instance in a separate terminal, it works perfectly and I get exactly what I want.

Now I want to make it easier to launch the 4 instances, so that I can do all of this with a single command from a single terminal. For this I use a bash script, batch.sh:

python job.py 4 0 &
python job.py 4 1 &
python job.py 4 2 &
python job.py 4 3 &

This does not work, and it turns out that subprocess is the culprit. All the Python code runs perfectly until it hits subprocess.call, after which I get:

[1]+  Stopped                 python job.py 4 0

The way I see it, I am trying to run job.py in the background, and job.py itself tries to run something else in the background via subprocess. This apparently does not work, for reasons I do not understand.

Is there a way to run job.py multiple times without requiring multiple terminals?
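
(For reference, launching from a single terminal can also be done from Python itself. Below is a minimal sketch equivalent to batch.sh; note that it only replaces the launching mechanism and does not by itself address the subprocess issue described above:)

import subprocess

# start all four instances without waiting, like the trailing & in batch.sh
processes = [subprocess.Popen(['python', 'job.py', '4', str(i)])
             for i in range(4)]

# then block until every instance has finished
for p in processes:
    p.wait()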

EDIT #1

On recommendation I tried the multiprocessing, thread, and threading packages. In the best case just one instance ran properly. So I tried an ugly workaround which does work: a bash script which launches each instance in a new terminal:

konsole -e python job.py 4 0
konsole -e python job.py 4 1
konsole -e python job.py 4 2
konsole -e python job.py 4 3

EDIT #2

Here is the actual function that uses subprocess.call (note: subprocess is imported as sp).

def run_case(path):
    case = path['case']
    os.chdir(case)                     # run the solver inside the case directory
    cmd = '{foam}; {solver} >log.{solver} 2>&1'.format(foam=CONFIG['FOAM'],
                                                       solver=CONFIG['SOLVER'])
    sp.call(['/bin/bash', '-i', '-c', cmd])  # -i: interactive shell, so the alias is expanded

Let me fill in the blanks:

  • CONFIG is a globally defined dictionary.
  • CONFIG['FOAM'] = 'of40' and this is an alias in my .bashrc used to source a file belonging to the binary I'm running (see the note after this list).
  • CONFIG['SOLVER'] = 'simpleFoam' and this is the binary I'm running.
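
For what it's worth, my best guess at why the background runs stop: aliases from .bashrc are only expanded in interactive shells, which is why the -i flag is there, but an interactive shell launched from a background job tries to take control of the terminal, and the kernel suspends the whole job with SIGTTIN. That matches the Stopped message above. A sketch of the two behaviors:

sp.call(['/bin/bash', '-c', 'of40'])        # non-interactive: the of40 alias is not expanded
sp.call(['/bin/bash', '-i', '-c', 'of40'])  # interactive: alias works, but stops when backgrounded

EDIT #3 below resolves this by sourcing the file directly, so the interactive flag is no longer needed.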

EDIT #3

I finally got it to work with this:

def run_case():
    case = CONFIG['PATH']['case']
    os.chdir(case)
    cmd = 'source {foam}; {solver} >log.{solver} 2>&1'.format(foam=CONFIG['FOAM'],
                                                              solver=CONFIG['SOLVER'])
    # with shell=True the command is passed as a single string, not a list
    sp.call(cmd, shell=True, executable='/bin/bash')

The solution was to set both shell=True and executable='/bin/bash' instead of putting /bin/bash in the command line itself. NOTE: foam is now the path to a file to be sourced, instead of an alias.
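
A possible refinement (an untested sketch, assuming the same CONFIG and imports as above): doing the redirection on the Python side keeps the shell command itself minimal:

def run_case():
    case = CONFIG['PATH']['case']
    os.chdir(case)
    cmd = 'source {foam}; {solver}'.format(foam=CONFIG['FOAM'],
                                           solver=CONFIG['SOLVER'])
    # let subprocess handle the log file instead of shell redirection
    with open('log.' + CONFIG['SOLVER'], 'w') as log:
        sp.call(cmd, shell=True, executable='/bin/bash',
                stdout=log, stderr=sp.STDOUT)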


Solution

  • You can parallelize from within Python:

    import multiprocessing
    import subprocess

    def run_job(spec):
        ...                      # set up the job from its spec
        if spec:                 # e.g. only call out when the spec requires it
            subprocess.call(...)

    def run_all_jobs(specs):
        pool = multiprocessing.Pool()   # defaults to one worker per CPU core
        pool.map(run_job, specs)        # blocks until every job has finished

    This has the advantage of letting you monitor, log, and debug the parallelization.
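
    Applied to this question, a runnable version might look like the following (a sketch; it assumes job.py is invoked exactly as in batch.sh above):

    import multiprocessing
    import subprocess

    def run_job(instance):
        # each worker blocks until its copy of job.py finishes
        return subprocess.call(['python', 'job.py', '4', str(instance)])

    if __name__ == '__main__':
        pool = multiprocessing.Pool(4)              # one worker per core
        exit_codes = pool.map(run_job, range(4))    # [0, 0, 0, 0] on success
        print(exit_codes)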