Tags: python, ubuntu, mpi, cluster-computing, mpich

Unable to correctly run a personal job on the Beowulf cluster. Example job works fine


I've recently set up a Beowulf cluster with one master node and two client nodes. Both client nodes share the master node's /home/mpiuser/ directory and automatically update whenever the directory changes on the master node. I have successfully run the compiled cpi example that ships with MPICH2 using the following command:

$ mpiexec -f hosts -n 3 /home/mpiuser/mpich2-1.4.1/examples/cpi

which gives the following output

Process 0 of 3 is on Master
Process 2 of 3 is on Slave2
Process 1 of 3 is on Slave1
pi is approximately 3.1415926544231318, Error is 0.0000000008333387
wall clock time = 0.001477

Then, when I try to run a Python file I created at /home/mpiuser/Development/fact_test.py using this command

$ mpiexec -f hosts -n 3 /home/mpiuser/Development/fact_test.py

I get the following errors

[proxy:0:0@Master] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file /home/mpiuser/Development/fact_test.py (Permission denied)
[proxy:0:1@Slave1] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file /home/mpiuser/Development/fact_test.py (Permission denied)
[proxy:0:2@Slave2] HYDU_create_process (./utils/launch/launch.c:69): execvp error on file /home/mpiuser/Development/fact_test.py (Permission denied)

Additionally, I can correctly get the names of the master and client nodes with this input and output:

$ mpirun --machinefile hosts hostname
Master
Slave1
Slave2

I'm not quite sure where the error is coming from. Some additional information: MPICH2 version 1.4.1, Python version 3.5.2.

fact_test.py:

import scipy as sp
import time

def factorial_func(i):
    return sp.math.factorial(i)

if __name__ == "__main__":
    i = 1e5
    t0 = time.time()
    fac = factorial_func(i)
    t1 = time.time()
    print(t1-t0)

If you need any more information I'd be happy to provide it. Thanks!


Solution

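  • Can you run /home/mpiuser/Development/fact_test.py directly on your login node?

    I doubt it, for two reasons: the file has no shebang line telling the system to use the Python interpreter, and the file might not be executable. The "Permission denied" from execvp points at least to the missing execute permission.

    One option is to add a shebang line at the very beginning of your file (adjust the path to the interpreter you want, e.g. /usr/bin/python3 for Python 3):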

    #!/usr/bin/python
    

    and then make the file executable:

    chmod 755 /home/mpiuser/Development/fact_test.py
    

    Another option is to invoke the Python interpreter explicitly, in which case your mpiexec command becomes

    mpiexec -f hosts -n 3 python /home/mpiuser/Development/fact_test.py
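
The two checks above (shebang line and execute permission) can also be done from Python before handing the script to mpiexec. This is a minimal sketch, assuming a Linux node; `diagnose_script` and `make_executable` are hypothetical helper names introduced for illustration and are not part of MPICH or of the original answer.

```python
import os
import stat
import sys


def diagnose_script(path):
    """Return a list of reasons why `path` is unlikely to be launchable directly."""
    problems = []

    # A directly executable script needs "#!" as its first two bytes.
    with open(path, "rb") as f:
        if f.read(2) != b"#!":
            problems.append("no shebang line")

    # Check the permission bits themselves (user, group, other).
    mode = os.stat(path).st_mode
    if not mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        problems.append("no execute permission bit set")

    return problems


def make_executable(path):
    """Same effect as `chmod 755 path`."""
    os.chmod(path, 0o755)


if __name__ == "__main__":
    # Usage (assumed file name): python check_script.py /home/mpiuser/Development/fact_test.py
    for script in sys.argv[1:]:
        issues = diagnose_script(script)
        print(script, "->", ", ".join(issues) if issues else "looks launchable")
```

Since the question states that the client nodes share /home/mpiuser/ with the master, fixing the file once on the master should be enough for all three nodes.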