Tags: centos, cluster-computing, job-scheduling, sungridengine

How to get the same submit command across multiple queuing environments easily?


Is there a standard set of commands for queuing systems? I know no one expects shell scripts to be portable, but why can't OpenLava, Sun Grid Engine, Platform LSF, etc. agree on a common set of commands for common queuing tasks such as job submission?

I shouldn't have to learn qsub, bsub, etc., since they do the same thing from a user's point of view.

Has anyone written wrappers that provide cluster-platform-agnostic job submission across multiple hosts?

A Google search suggests that no one has set up such a standardized platform, or even started agitating for one: https://www.google.com/search?q=posix+queueing+standard&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a&channel=sb#channel=sb&q=cluster+queueing+standard&rls=org.mozilla:en-US:official

tentakel and jobscheduler also look like interesting packages, but neither of them is part of CentOS.


Solution

  • One possible solution is DRMAA. I know both SGE and LSF support DRMAA, but it is a uniform library (an API you program against) rather than a uniform submit script.

    A common submit interface is a well-known problem in the grid world. There have been many solutions, such as:

    • A generic job language that is translated into each scheduler's native form, such as RSL (Globus) and JSDL (Job Submission Description Language, a standard implemented by a few grid middlewares).
    • Translating one submit language into another, or into many others, such as Condor's grid universe with the batch grid type.

    Grid middleware seems heavy-handed for your use case. Something like Bosco may work for you, or just plain old Condor (you only have to learn Condor's submit language to submit to PBS, LSF, or SGE).

    But you are right: each scheduler comes up with its own submission language, and there is no unified method for submitting to all of them.
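
If none of those fit, a thin wrapper of your own is not much work for the simple cases. Below is a minimal sketch in Python, not an existing package: the names `Job`, `SCHEDULERS`, `detect_scheduler` and `build_submit_command` are made up for illustration. It assumes only the basic flags that SGE's qsub and LSF's bsub document (`-N` or `-J` for the job name, `-q` for the queue, `-o` and `-e` for output files) and ignores resource requests, arrays, and everything else where the schedulers really differ.

```python
import shlex
import shutil
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    """Scheduler-neutral job description (hypothetical, for illustration)."""
    command: str
    name: str = "job"
    queue: Optional[str] = None
    stdout: Optional[str] = None
    stderr: Optional[str] = None


# Maps a scheduler key to (submit program, flag that sets the job name).
# Queue, stdout and stderr use -q, -o and -e in both qsub (SGE) and bsub (LSF).
SCHEDULERS = {
    "sge": ("qsub", "-N"),
    "lsf": ("bsub", "-J"),
}


def detect_scheduler() -> Optional[str]:
    """Guess the scheduler from whichever submit program is on PATH.

    Caveat: PBS/Torque also ships a program called qsub, so finding qsub
    does not prove the cluster runs SGE. Treat the result as a hint only.
    """
    for key, (program, _name_flag) in SCHEDULERS.items():
        if shutil.which(program):
            return key
    return None


def build_submit_command(job: Job, scheduler: str) -> List[str]:
    """Translate a Job into the argv list for the given scheduler."""
    program, name_flag = SCHEDULERS[scheduler]
    argv = [program, name_flag, job.name]
    if job.queue:
        argv += ["-q", job.queue]
    if job.stdout:
        argv += ["-o", job.stdout]
    if job.stderr:
        argv += ["-e", job.stderr]
    if scheduler == "sge":
        # SGE's qsub expects a script file unless told the job is a command.
        argv += ["-b", "y"]
    argv += shlex.split(job.command)
    return argv
```

To actually submit, you would pass the resulting list to `subprocess.run(argv, check=True)` on a submit host. Building the argument list separately from running it keeps the translation easy to inspect and to extend with another scheduler.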