python, bash, parallel-processing, sungridengine

Submit python script to Sun Grid Engine with Joblib parallelization


I have a Python script that uses the joblib module with the loky backend for parallelization, with n_jobs set to 12. How should I submit it to Sun Grid Engine with the qsub command?

I have something like this:

qsub -q all.q     \
     -l h_vmem=14G \
     -V             \
     -N              \
     -pe all.pe       \
      12               \
      script.py

Would that work as expected for python script with joblib module?


Solution

  • Q : "Would that work as expected for python script with joblib module?"

    That depends on what one expects from an "as expected" clause, doesn't it?

    The Python interpreter, as described above, will certainly start 12 worker processes, as n_jobs dictates ( spawned the way loky implements it ), because it was instructed to do so.

    Yet the qsub request itself does not start anything in parallel. With -pe all.pe 12 it asks the SGE/SoGE/*GE scheduler to reserve 12 slots in the parallel environment all.pe. The job script is still launched once, on the master host of the allocation, and the number of granted slots is exposed to it as $NSLOTS. Where h_vmem is configured as a per-slot consumable ( a common setup, but check yours ), -l h_vmem=14G combined with 12 slots adds up to 12 x 14 [GB] ~ 168 [GB] RAM, so hope you have a host with at least that amount of physical RAM available, otherwise the job may wait in the queue or run into its memory limit, with devastatingly adverse effects on any expectations one may have from the joblib tricks ( which only come later ).

    That single Python process will then start its own 12 loky-mechanised, joblib-dictated n_jobs workers, i.e. one worker per reserved slot, not 12 x 12 = 144. Because loky workers are local processes, this only matches the reservation if all.pe places all 12 slots on one host ( that depends on its allocation_rule, which qconf -sp all.pe shows ). The nature of the workload will determine whether the slots end up "wasted" ( if disk-I/O-bound ), "underperforming" ( if the total CPU/RAM traffic has already hit the physical memory-I/O ceiling ), or "smart" ( if resources and requirements stay in a Lege Artis balance throughout the whole computing graph, for which my hat would, indeed, be raised in respect ).
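To keep the number of joblib workers in step with what the scheduler actually granted, the script can read the slot count from the environment instead of hard-coding 12. The following is only a minimal sketch: NSLOTS is the variable Grid Engine sets for parallel-environment jobs, while the helper name `granted_slots` and its fallback value are assumptions made for illustration.

```python
import os


def granted_slots(env=None, default=1):
    """Return the slot count granted by the scheduler.

    Falls back to `default` when NSLOTS is missing or not a positive
    integer, e.g. when the script is run outside Grid Engine.
    (Hypothetical helper, not part of joblib or SGE.)
    """
    env = os.environ if env is None else env
    raw = env.get("NSLOTS", "")
    try:
        slots = int(raw)
    except ValueError:
        return default
    return slots if slots > 0 else default


if __name__ == "__main__":
    # Report how many workers the script would use in this environment.
    print("workers to start:", granted_slots())
```

Inside script.py the result would then be handed to joblib, for example as `Parallel(n_jobs=granted_slots(), backend="loky")`, so that a later change of the -pe request does not silently oversubscribe the reserved slots.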

    The Best Next Step :

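    Review the configuration with your SGE/SoGE Technical Support department, ideally with a pre-testing run before the main job submission. Note that -v only exports named environment variables; the verifying switches are -w v ( validate the request without submitting it ) and -verify ( print the parsed request instead of submitting it ) :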

    ######################################################
    #               this-JOB-SUBMISSION-CONFIGURATION-file
    ######################################################
    # Usage:
    #
    #        qsub -@ <this-JOB-SUBMISSION-CONFIGURATION-file>
    #
    # Remarks:
    #
    #        qsub -q all.q     \      ### -q  define-LIST-OF-QUEUES to be used for scheduling this JOB
    #             -l h_vmem=14G \     ### -l  define-RESOURCES
    #             -V             \    ### -V  export-ALL-ENVIRONMENT-VARS to the job-CONTEXT
    #             -N              \   ### -N  qsub-JOB-NAME ? a missing value :: rather test first with -w v ( qsub-JOB-SUBMISSION-VERIFIER )
    #             -pe all.pe 12    \  ### -pe request-SLOTS-in-a-PARALLEL-ENVIRONMENT <pe_name> <pe_min>[-[<pe_max>]]
    #              script.py          ###     <command> [ <command_parameters> [...]]
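As a rough sketch of such a dry run, the same request can be assembled from Python with a job name filled in and the validate-only switch -w v appended. The function name `build_qsub_argv` and the job name `joblib_job` are hypothetical; the queue, PE name and memory value are simply the ones from the question, and whether script.py can be submitted directly depends on how your site starts job scripts.

```python
import shlex


def build_qsub_argv(script, job_name, pe_name, slots,
                    queue="all.q", h_vmem="14G", verify_only=True):
    """Assemble a qsub argument list (hypothetical helper).

    With verify_only=True the request is only validated (-w v),
    nothing is actually submitted.
    """
    argv = [
        "qsub",
        "-q", queue,               # queue(s) to schedule into
        "-l", f"h_vmem={h_vmem}",  # memory limit request
        "-V",                      # export the current environment
        "-N", job_name,            # -N needs a value: the job name
        "-pe", pe_name, str(slots),  # reserve slots in the PE
    ]
    if verify_only:
        argv += ["-w", "v"]
    argv.append(script)
    return argv


if __name__ == "__main__":
    # Print the command line for inspection; run it by hand, or pass
    # the list to subprocess.run() on a host where qsub is available.
    print(shlex.join(build_qsub_argv("script.py", "joblib_job", "all.pe", 12)))
```

Once the validation report looks right, calling the helper with `verify_only=False` gives the argument list for the real submission.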