Tags: bash, openmp, cluster-computing, openmpi, sungridengine

Best practice for job resource scaling (environment) on cluster computing?


I am quite new to programming on a cluster and am having great difficulty finding my way around. I am on an SGE cluster with bash, using OpenMPI.

I have a task where I want to run several variations of my process; the only difference between them is the configuration, in that I allocate more resources to my program each time. Take this example:

#$ -pe openmpi $process_num

Here I am allocating process_num slots to my job's parallel environment. I want that allocation to change: for example, I want to try 1, 2, and 3 for process_num, so I have 3 variations. I was thinking of submitting one sh job containing a simple loop such as:

# ... other environment variable definitions
for process_num in 1 2 3
do
   # ... some other environment variable definitions
   #$ -pe openmpi $process_num
   mpirun ./my_prog -npernode 1
done

In other words, one 'packed' job would execute all my variations and handle the resource allocation/scaling. I thought that this way I could allocate different resources to each of my 3 job variations, one per iteration. I want to ask whether this is possible, i.e. can the job environment scale in the way described, or will I have to submit 3 separate jobs?

Of course, if the answer is yes, submit separate jobs, then what happens when I have some 50 such configurations I want to try? What is the best-practice approach to submitting 50 (or a large number of) separate jobs?

Unfortunately as the cluster is a shared resource, I am not free to experiment as I would like to.


Solution

  • A job is 'defined' by the resources it uses. If you want to test three resource configurations, you need to submit three jobs.

    The other option would be to allocate the maximal configuration and run the three variants sequentially inside one job, which is what the script in the question suggests. But you would be wasting cluster resources by allocating CPUs and then not using them.
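    To illustrate, here is a rough sketch of that "maximal allocation" variant (assuming the parallel environment is named openmpi and the program is my_prog as in the question, and that 3 slots is the largest configuration; the -S and -cwd lines are just common boilerplate):

        #!/bin/bash
        #$ -S /bin/bash
        #$ -cwd
        #$ -pe openmpi 3    # request the largest configuration once, up front

        for np in 1 2 3
        do
            # each iteration uses only $np of the 3 allocated slots,
            # which is exactly the waste described above
            mpirun -np $np ./my_prog
        done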

    The best practice is to use all resources you allocate to the fullest possible extent.

    It's easy to submit multiple jobs via a script on the front-end node. I believe SGE uses qsub, so it would be something like parallel "qsub -pe openmpi {} -v CPUS={} -l n_cpus={} test-job.sh" ::: 1 2 3. The exact qsub syntax depends a lot on your environment. In test-job.sh you would use $CPUS to start your MPI job correctly (this may not even be needed; a correctly initialized SGE parallel environment via -pe might be enough). I'm using GNU parallel instead of a bash loop only because of the nicer, more compact syntax; it makes no functional difference.
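    A minimal sketch of what test-job.sh could look like, assuming CPUS is passed in via -v as in the command above (the fallback to SGE's NSLOTS and the -S/-cwd lines are assumptions, not taken from the question):

        #!/bin/bash
        #$ -S /bin/bash
        #$ -cwd
        # CPUS comes from qsub -v CPUS=...; fall back to NSLOTS, which SGE sets
        # to the number of slots granted by the parallel environment
        NP=${CPUS:-$NSLOTS}
        mpirun -np "$NP" ./my_prog

    A plain bash loop on the front-end node would do the same submission, e.g. for n in 1 2 3; do qsub -pe openmpi $n -v CPUS=$n test-job.sh; done, and scales to 50 configurations simply by extending the list.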