Tags: python, bash, environment-variables, pass-by-reference, slurm

Evaluate "Shell command line with shell variables" from python OR evaluate python string as shell command line


CONTEXT

I am working on a simulation cluster. To keep it as flexible as possible (it has to work with different simulation software), we wrote a Python script that parses a config file defining environment variables and the command line that starts the simulation. This command is launched through the SLURM sbatch command (shell $COMMAND).

ISSUE

From Python, all environment variables are set by reading the config file. I have an issue with the variable COMMAND, which refers to other environment variables (written as shell variables).

For example

import os
from subprocess import Popen, PIPE
COMMAND = "fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE"
os.environ['COMMAND'] = COMMAND
NUMPROCS = "32"
os.environ['NUMPROCS'] = NUMPROCS
[...]
exe = Popen(['sbatch', 'template_document.sbatch'], stdout=PIPE, stderr=PIPE)

sbatch distributes COMMAND to all simulation nodes, where it is run as a command line.

COMMAND refers to other environment variables, but the shell treats it strictly as text, which makes the command line fail. The $ references are never expanded, so the command stays the literal string 'fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE'.

SOLUTION I AM LOOKING FOR

I am looking for a simple solution:

Solution 1: one to three lines of Python that evaluate COMMAND the way the shell would, so that it can be echoed.

Solution 2: a shell command that expands the variables inside the string $COMMAND.

In the end, the command launched from within sbatch should be

fluent -3ddp -n32 -hosts=./hosts -file /path/to/JOBFILE

Solution

  • You have a few options:

    1. Partial or no support for bash's variable substitution, e.g. implement some Python functionality that reproduces bash's $VARIABLE syntax.

    2. Reproduce all of bash's variable substitution facilities that the config file is allowed to use ($VARIABLE, ${VARIABLE}, ${VARIABLE/x/y}, $(cmd), and so on).

    3. Let bash do the heavy lifting, at the cost of performance and possibly security, depending on how much you trust the content of the config files.
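
    For comparison, the first option can be quite small if plain $VARIABLE and ${VARIABLE} references are all the config needs. This is only a minimal sketch using the standard library's string.Template; the function name expand_command and the values dict are made up for illustration, and the JOBFILE value is just the placeholder path from the question. It does not handle ${VARIABLE/x/y} or $(cmd).

```python
from string import Template


def expand_command(command, values):
    """Expand $NAME and ${NAME} references in `command` using `values`.

    Hypothetical helper: it covers only these two forms, not bash's
    ${VAR/x/y} or $(cmd). A missing name raises KeyError.
    """
    return Template(command).substitute(values)


COMMAND = "fluent -3ddp -n$NUMPROCS -hosts=./hosts -file $JOBFILE"
values = {"NUMPROCS": "32", "JOBFILE": "/path/to/JOBFILE"}  # illustrative values
print(expand_command(COMMAND, values))
# prints: fluent -3ddp -n32 -hosts=./hosts -file /path/to/JOBFILE
```

    If the values already live in os.environ, os.path.expandvars(COMMAND) does a similar expansion and leaves unknown variables unchanged.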

    I'll show the third one here, since it's the most resilient (again, security issues notwithstanding). Let's say you have this config file, config.py:

    REGULAR   =   "some-text"
    EQUALS = "hello = goodbye" # trap #1: search of '='
    SUBST = "decorated $REGULAR"
    FANCY   = "xoxo${REGULAR}xoxo"
    CMDOUT = "$(date)"
    BASH_A = "trap" # trap #2: avoid matching variables like BASH_ARGV
    QUOTES = "'\"" # trap #3: quoting
    

    Then your python program can run the following incantation:

    bash -c 'source <(sed "s/^/export /" config.py | sed "s/[[:space:]]*=[[:space:]]*/=/") && env | grep -f <(cut -d= -f1 config.py | grep -E -o "\w+" | sed "s/.*/^&=/")'
    

    which will produce the following output:

    SUBST=decorated some-text
    CMDOUT=Thu Nov 28 12:18:50 PST 2019
    REGULAR=some-text
    QUOTES='"
    FANCY=xoxosome-textxoxo
    EQUALS=hello = goodbye
    BASH_A=trap
    

    You can then read this output from Python, but note that the surrounding quotes are gone (bash consumed them while sourcing), so you'll have to account for that.
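
    On the Python side, running the incantation and reading the result could look roughly like this. It is a sketch under a few assumptions: bash, sed, cut and grep (with \w support) are on PATH, no value spans several lines, and the config path is passed as $1 instead of being hard-coded. The name load_config is made up for illustration.

```python
import subprocess

# Same pipeline as the incantation, with the config file passed as "$1".
INCANTATION = (
    'source <(sed "s/^/export /" "$1" | sed "s/[[:space:]]*=[[:space:]]*/=/") '
    '&& env | grep -f <(cut -d= -f1 "$1" | grep -E -o "\\w+" | sed "s/.*/^&=/")'
)


def load_config(path):
    """Let bash expand the config file and return {name: expanded value}."""
    result = subprocess.run(
        # The second "bash" becomes $0 inside the script, `path` becomes $1.
        ["bash", "-c", INCANTATION, "bash", path],
        capture_output=True,
        text=True,
        check=True,
    )
    config = {}
    for line in result.stdout.splitlines():
        # Split on the first '=' only, so values may contain '=' themselves.
        name, _, value = line.partition("=")
        config[name] = value
    return config
```

    Since env prints one NAME=value pair per line, a value containing a newline would break this simple parser.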

    Explanation of the incantation:

    bash -c 'source ...instructions... && env | grep ...expressions...' tells bash to read & interpret the instructions, then grep the environment for the expressions. We're going to turn the config file into instructions which modify bash's environment.

    If you try using set instead of env, the output will be inconsistent with respect to quoting. Using env avoids trap #3.

    Instructions: We're going to create instructions of the form:

    export FANCY="xoxo${REGULAR}xoxo"
    

    so that bash can interpret them and env can read them.

    • sed "s/^/export /" config.py prefixes the variables with export.
    • sed "s/[[:space:]]*=[[:space:]]*/=/" converts the assignment format to syntax that bash can read with source. Using s/x/y/ instead of s/x/y/g avoids trap #1.
    • source <(...command...) causes bash to treat the output of the command as a file and run its lines, one by one.
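
    To see what the two sed steps do to a single line, here is the same rewrite expressed in Python. This is only an illustration of the transformation, not something the incantation needs; to_export_line is a made-up name.

```python
import re


def to_export_line(line):
    """Mimic the two sed steps: normalise the first '=' and prefix 'export '."""
    # count=1 plays the role of sed's s/x/y/ without the g flag (trap #1):
    # only the first '=' is touched, the one inside the value is left alone.
    normalised = re.sub(r"\s*=\s*", "=", line, count=1)
    return "export " + normalised


print(to_export_line('EQUALS = "hello = goodbye"'))
# prints: export EQUALS="hello = goodbye"
```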

    Of course, one way to avoid this complexity is to have the file use bash syntax to begin with. If that were the case, we would use source config.sh instead of source <(...command...).

    Expressions: We want to grep the output of env for patterns like ^FANCY=.

    • cut -d= -f1 config.py | grep -E -o "\w+" finds the variable names in config.py.
    • sed "s/.*/^&=/" turns variable names like FANCY to grep search expressions such as ^FANCY=. This is to avoid trap #2.
    • grep -f <(...command...) gets grep to treat the output of the command as a file containing one search expression in each line, which in this case would be ^FANCY=, ^CMDOUT= etc.
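
    The same filtering idea can be sketched in Python, which makes trap #2 easy to see: each name is anchored between ^ and =, so BASH_A cannot match BASH_ARGV. The function name and sample lines below are made up for illustration.

```python
import re


def select_config_vars(env_lines, names):
    """Keep only the NAME=value lines whose NAME is one of `names`."""
    # Equivalent to the ^NAME= expressions handed to grep -f.
    pattern = re.compile(r"^(?:%s)=" % "|".join(re.escape(name) for name in names))
    return [line for line in env_lines if pattern.match(line)]


env_lines = ["BASH_A=trap", "BASH_ARGV=()", "FANCY=xoxosome-textxoxo"]
print(select_config_vars(env_lines, ["BASH_A", "FANCY"]))
# prints: ['BASH_A=trap', 'FANCY=xoxosome-textxoxo']
```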

    EDIT

    Since you want to pass this environment to another bash command rather than use it in Python, you can simply have Python run this:

    bash -c 'source <(sed "s/^/export /" config.py | sed "s/[[:space:]]*=[[:space:]]*/=/") && $COMMAND'
    

    (assuming that COMMAND is specified in the config file).
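
    From Python, that could be wrapped roughly as follows. Again a sketch: run_command_from_config is a made-up name and the config path is passed as $1. One thing to watch: bash expands $NUMPROCS at the moment the COMMAND line is sourced, so the config file has to define NUMPROCS and JOBFILE before COMMAND, otherwise they expand to empty strings.

```python
import subprocess

# Source the rewritten config, then run whatever COMMAND expanded to.
SCRIPT = (
    'source <(sed "s/^/export /" "$1" | sed "s/[[:space:]]*=[[:space:]]*/=/") '
    '&& $COMMAND'
)


def run_command_from_config(path):
    """Run the COMMAND defined in the config file at `path` through bash."""
    return subprocess.run(
        # The second "bash" becomes $0 inside the script, `path` becomes $1.
        ["bash", "-c", SCRIPT, "bash", path],
        capture_output=True,
        text=True,
    )
```

    The returned object carries returncode, stdout and stderr, much like the Popen call in the question. Because $COMMAND is unquoted, bash splits it on whitespace, so arguments containing spaces would need more care.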