
Requesting nodes by number and by name in SGE


  1. How do I request a number of nodes (not processors) when submitting a job in SGE?

    For example, in TORQUE we can specify qsub -l nodes=3.

  2. How do I request nodes by name in SGE?

    For example, in TORQUE we can do this with qsub -l nodes=abc+xyz+pqr, where abc, xyz and pqr are hostnames.

    For a single hostname, qsub -l hostname=abc works. But how do I delimit multiple hostnames in SGE?


Solution

  • Grid Engine handles requests for a number of nodes indirectly. To submit a parallel job, you request a parallel environment (see man sge_pe) together with a number of slots (processors, cores, etc.), for example qsub -pe mytestpe 12...

    How the slots are distributed over one or more nodes depends on the allocation_rule defined in the parallel environment (shown by qconf -sp mytestpe). The easy case is a so-called fixed allocation rule, where allocation_rule is simply an integer such as 4 (4 slots per host): the job then gets the requested slot count divided by that integer as its number of hosts. If you want one host, submit with -pe mytestpe 4; if you want 10 nodes, submit with -pe mytestpe 40.
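The arithmetic for a fixed allocation rule can be sketched as a small shell helper. This is only an illustration: slots_for_nodes is a hypothetical function name, job.sh is a placeholder script, and the sketch assumes the parallel environment really uses an integer allocation_rule (check with qconf -sp mytestpe first). The qsub line is printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): convert a node count into
# the slot count to request, ASSUMING the PE has a fixed integer
# allocation_rule, i.e. a constant number of slots per host.
slots_for_nodes() {
    nodes=$1
    slots_per_host=$2
    echo $(( nodes * slots_per_host ))
}

# Print the submit command for 10 nodes with allocation_rule 4.
# "mytestpe" is the PE name used in the answer; "job.sh" is a placeholder.
echo "qsub -pe mytestpe $(slots_for_nodes 10 4) job.sh"
```

With the values above this prints qsub -pe mytestpe 40 job.sh, matching the 10-node case described in the answer.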

    A node can be requested by name with -l h=abc. Because hostnames are of type RESTRING (regular expression string) in Grid Engine, you can write a regular expression to filter hosts: qsub -l h="abc|xyz". You can also create host groups (qconf -ahgrp) and request so-called queue domains (qsub -q all.q@@mygroup).
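For more than two hosts, the "|"-separated pattern can be built from a list of names. The following is a minimal sketch, assuming plain hostnames without regular-expression metacharacters; host_pattern is a hypothetical helper and job.sh is a placeholder script. As before, the qsub line is only printed.

```shell
#!/bin/sh
# Hypothetical helper (not part of Grid Engine): join hostnames with "|"
# to form a pattern suitable for qsub -l h="...".
# ASSUMPTION: hostnames contain no regex metacharacters.
host_pattern() {
    pattern=""
    for host in "$@"; do
        if [ -z "$pattern" ]; then
            pattern=$host
        else
            pattern="$pattern|$host"
        fi
    done
    echo "$pattern"
}

# Print the submit command for the three hosts from the question.
echo "qsub -l h=\"$(host_pattern abc xyz pqr)\" job.sh"
```

This prints qsub -l h="abc|xyz|pqr" job.sh. The quotes around the pattern matter when the command is actually run, since an unquoted | would be interpreted by the shell as a pipe.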

    Daniel

    http://www.gridengine.eu