python, linux, sungridengine

How to pass an execution environment to SGE


First of all, the word 'environment' has different meanings, so let me clarify what I mean.

I am working on a Python flow on Linux that depends on certain libraries, software components, and files (e.g. yaml) in order to run and to recognize its custom commands. When I say environment, I mean this entire set of dependencies.

What I am looking for is a way to encapsulate all of these requisites into something (I don't know the technical term for such a thing, or whether it is possible at all) and pass it to the grid engine, so that the nodes on the grid do not all need the same programs and libraries installed and can use this bundle to run the job.

Has anyone come across such a scenario? Is this possible at all?

The alternative is to ssh into every node and make sure those libraries etc. are installed on each one individually.


Solution

  • I see two main options.

    1. Virtual environments: Create a virtual environment on a filesystem that is NFS-mounted on your nodes, and have your jobs use that virtual environment. (Note that I think virtualenv works with full paths, so you would need to mount it at the same path it has on the master node.)
    2. Use something like Docker to package (or, as you wrote, encapsulate) your executables so that all the libraries they need are self-contained.
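
As a rough sketch of both options, the Python below builds an SGE job script for each case and pipes it to `qsub`. Everything specific here is an assumption for illustration: the shared path `/shared/envs/flow`, the script `run_flow.py`, the config `flow.yaml`, the image name `flow-image:latest`, and the helper names themselves. The Docker variant also assumes Docker is installed on every execution node and that the job user is allowed to run it.

```python
import shlex
import subprocess
from pathlib import Path


def venv_job_script(venv_dir, command, job_name="flow_job"):
    """Build an SGE job script that runs `command` with the shared venv's Python.

    `venv_dir` must be the same absolute path on every node (e.g. an NFS mount).
    """
    python = Path(venv_dir) / "bin" / "python"
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",  # job name shown in qstat
        "#$ -cwd",            # run the job in the submission directory
        "#$ -S /bin/bash",    # interpret the job script with bash
        # Calling the venv's interpreter directly avoids needing to 'activate' it.
        f"{shlex.quote(str(python))} {shlex.join(command)}",
    ]
    return "\n".join(lines) + "\n"


def docker_job_script(image, command, job_name="flow_job"):
    """Build an SGE job script that runs `command` inside a Docker container."""
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",
        "#$ -cwd",
        "#$ -S /bin/bash",
        # Mount the submission directory into the container and work from there.
        f'docker run --rm -v "$PWD":/work -w /work '
        f"{shlex.quote(image)} {shlex.join(command)}",
    ]
    return "\n".join(lines) + "\n"


def submit(script_text):
    """Submit a job script to SGE; qsub reads the script from stdin if no file is given."""
    result = subprocess.run(
        ["qsub"], input=script_text, text=True, capture_output=True, check=True
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical paths and names; print instead of submitting so this runs anywhere.
    print(venv_job_script("/shared/envs/flow", ["run_flow.py", "--config", "flow.yaml"]))
    print(docker_job_script("flow-image:latest", ["python", "run_flow.py"]))
```

On a submit host, `submit(venv_job_script(...))` would hand the generated script to `qsub`. With the virtual environment approach only the shared mount has to be kept consistent across nodes, while the Docker approach moves the dependencies into the image at the cost of needing a container runtime on each node.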