Torque PBS passing environment variables that contain quotes


I have a Python script. Normally I would run it like this:

./make_graph data_directory "wonderful graph title"

I have to run this script through the scheduler. I am using -v to pass the arguments for the python script through qsub.

qsub make_graph.pbs -v ARGS="data_directory \"wonderful graph title\""

I have tried many combinations of ', ", \" escaping and I just can't get it right. The quoting around 'wonderful graph title' is always either lost or mangled.

Here is an excerpt from the pbs script

if [ -z "${ARGS+xxx}" ]; then
        echo "NO ARGS SPECIFIED!"
        exit 1
fi

CMD="/path/make_graph $ARGS"
echo "CMD: $CMD"

echo "Job started on `hostname` at `date`"
${CMD}

What is the proper way to pass a string parameter that contains spaces through qsub as an environment variable? Is there a better way to do this? Maybe this is a more general bash problem.


Solution

  • Update: This answer is based on SGE qsub rather than TORQUE qsub, so the CLI is somewhat different. In particular, TORQUE qsub doesn't seem to support direct argument passing, so the second approach doesn't work there.


    This is mainly a problem of proper quoting and has little to do with grid engine submission itself. If you just want to fix your current script, you should use eval "${CMD}" rather than ${CMD}. Here's a detailed analysis of what happens when you do ${CMD} alone (in the analysis we assume there's nothing funny in path):

    1. Your qsub command line is processed and quotes removed, so the ARGS environment variable passed is data_directory "wonderful graph title".

    2. You did CMD="/path/make_graph $ARGS", so the value of CMD is /path/make_graph data_directory "wonderful graph title" (I'm presenting the string literal without quoting, that is, the value literally contains the quote characters).

    3. You did ${CMD}. Bash performs a parameter expansion on this, which amounts to:

      1. Expanding ${CMD} to its value /path/make_graph data_directory "wonderful graph title";
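      2. Since ${CMD} is not quoted, bash then performs word splitting on the result. Quote characters that come out of an expansion are treated as ordinary characters, not as quoting syntax, so in the end the command line has five words: /path/make_graph, data_directory, "wonderful, graph, title". The last four are treated as arguments to your make_graph, which is certainly not what you want.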

    On the other hand, if you use eval "${CMD}", then it is as if you typed /path/make_graph data_directory "wonderful graph title" into an interactive shell, which is the desired behavior.
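
    To see the difference without touching the scheduler, here is a small sketch. count_args is a hypothetical stand-in for /path/make_graph that only reports how many arguments it received; everything else mirrors the script above.

```shell
#!/usr/bin/env bash
# count_args is a hypothetical stand-in for /path/make_graph:
# it prints the number of arguments it was called with.
count_args() { echo "$#"; }

# The value qsub would hand over: the inner quotes are literal characters.
ARGS='data_directory "wonderful graph title"'
CMD="count_args ${ARGS}"

# Unquoted expansion: word splitting only, the embedded quotes stay literal,
# so the title is broken into three separate words.
split_count=$(${CMD})

# eval: the string is parsed again as shell source, so the quotes group the title.
eval_count=$(eval "${CMD}")

echo "plain expansion: ${split_count} arguments"   # 4
echo "eval:            ${eval_count} arguments"    # 2
```

    Keep in mind that eval runs whatever shell syntax ends up in ARGS, so this only makes sense when you control that value.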

    You should read more about eval, parameter expansion, etc. in the Bash Reference Manual.

    The corrected script:

    #!/usr/bin/env bash
    [[ -z ${ARGS+xxx} ]] && { echo "NO ARGS SPECIFIED!" >&2; exit 1; }
    
    CMD="/path/make_graph ${ARGS}"
    echo "CMD: ${CMD}"
    
    echo "Job started on $(hostname) at $(date)" # $(...) is preferred over legacy backticks
    eval "${CMD}"
    

    By the way, to test this, you don't need to submit it to the grid engine; just do

    ARGS="data_directory \"wonderful graph title\"" bash make_graph.pbs
    

    Okay, I just pointed out what's wrong and patched it. But is it really the "proper way" to pass arguments to grid engine jobs? No, I don't think so. Arguments are arguments, and should not be confused with environment variables. qsub allows you to pass arguments directly (qsub synopsis: qsub [ options ] [ command | -- [ command_args ]]), so why encode them in an env var and end up worrying about quoting?

    Here's a better way to write your submission script:

    #!/usr/bin/env bash
    [[ $# == 0 ]] && { echo "NO ARGS SPECIFIED!" >&2; exit 1; }
    
    CMD="/path/make_graph $*" # for display only; "$*" joins the arguments with spaces
    echo "CMD: ${CMD}"
    
    echo "Job started on $(hostname) at $(date)" # $(...) is preferred over legacy backticks
    /path/make_graph "$@"
    

    Here "$@" is equivalent to "$1" "$2" ... — faithfully passing all arguments as is (see relevant section in the Bash Reference Manual).
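
    A quick way to convince yourself of this is the sketch below. The helper names (count_args and the three forward_* functions) are made up for illustration; each forwards the same two arguments in a different way and reports how many arrive.

```shell
#!/usr/bin/env bash
# All function names here are hypothetical helpers for illustration.
count_args() { echo "$#"; }

forward_quoted()   { count_args "$@"; }  # each argument stays one word
forward_unquoted() { count_args $@; }    # arguments are word-split again
forward_star()     { count_args "$*"; }  # everything is joined into one word

quoted_count=$(forward_quoted data_directory "wonderful graph title")
unquoted_count=$(forward_unquoted data_directory "wonderful graph title")
star_count=$(forward_star data_directory "wonderful graph title")

echo "\"\$@\": ${quoted_count}"    # 2
echo "\$@:   ${unquoted_count}"    # 4
echo "\"\$*\": ${star_count}"      # 1
```

    Only the quoted "$@" form hands make_graph the same two arguments you typed.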

    One thing unfortunate about this, though, is that although the command executed is correct, the one printed may not be properly quoted. For instance, if you do

    qsub make_graph.pbs data_directory "wonderful graph title"
    

    then what gets executed is /path/make_graph data_directory "wonderful graph title", but the printed CMD is /path/make_graph data_directory wonderful graph title. And there's no easy way to fix this, as far as I know, since quotes are always removed from arguments no matter how word splitting is done. If the command printed is really important to you, there are two solutions:

    1. Use a dedicated "shell escaper" (pretty easy to write one for yourself) to quote the arguments before printing;

    2. Use another scripting language where shell quoting is readily available, e.g., Python (shlex.quote) or Ruby (Shellwords.shellescape).
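
    For the first option, a minimal sketch might look like the following, assuming a POSIX shell with sed available. shell_quote and quote_command are hypothetical names, not part of qsub or bash. The idea is to wrap every argument in single quotes and rewrite each embedded single quote as '\''. One known limitation: trailing newlines inside an argument are lost, because command substitution strips them.

```shell
#!/usr/bin/env bash
# shell_quote (hypothetical helper): print one argument wrapped in single
# quotes, with every embedded ' rewritten as '\'' so the result can be
# pasted back into a shell.
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# quote_command (hypothetical helper): quote every argument and join them
# with single spaces, producing a printable command line.
quote_command() {
    out=""
    for arg in "$@"; do
        out="${out}${out:+ }$(shell_quote "$arg")"
    done
    printf '%s\n' "$out"
}

quote_command /path/make_graph data_directory "wonderful graph title"
# prints: '/path/make_graph' 'data_directory' 'wonderful graph title'
```

    In the submission script you could then write echo "CMD: $(quote_command /path/make_graph "$@")". If the script is guaranteed to run under bash, the builtin printf '%q ' "$@" does a similar job with different (backslash-based) output.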