perl awk sed grep

Is there a fairly portable and succinct method for parsing an environment variable from a script?


I have come up with a few solutions, but I don't love them. I'm wondering if there's a better way. I'm mainly looking for something that's succinct and doesn't require flags and is usable on most any unix system. I'm also not entirely sure which one of the below is most portable. As far as I know, the only one that's not is the gawk solution.


example file

I want to parse the value of the BAR variable

# a comment
FOO="ENV_FOO"
BAR="ENV_BAR"

# the same content in a shell variable, used by the commands below
textfile="# a comment\nFOO=\"ENV_FOO\"\nBAR=\"ENV_BAR\""

# printf '%b\n' expands the \n escapes in $textfile in any POSIX shell;
# an unquoted echo only does so in some shells (not in bash by default)

# awk: split on "=" delimiter
printf '%b\n' "$textfile" | awk -F "=" '/^BAR=/ { gsub(/"/,"",$2); print $2 }'

# awk: replace beginning of string with empty string; handle quotes with tr
printf '%b\n' "$textfile" | awk '/^BAR=/ { gsub(/^.*BAR=/, ""); print }' | tr -d '"'

# gawk: most straight-forward to me but not portable or DRY
printf '%b\n' "$textfile" | gawk '/^BAR=/ { print gensub(/^.*BAR="(.*)"$/, "\\1", "g") }'

# grep + sed
printf '%b\n' "$textfile" | grep '^BAR=' | sed -E 's#^.*"(.*)"$#\1#'

# sed only
printf '%b\n' "$textfile" | sed -nE 's#^BAR="(.*)"$#\1#p'

# perl: maybe I just need to work on remembering these flags as this is succinct
printf '%b\n' "$textfile" | perl -alE 'print $1 if /^BAR="(.*)"$/'

Each of them is straightforward in its own way but many require remembering special flags that need to be included in order to work. Is there another standard unix tool that handles this use-case I'm not thinking of?


Solution

  • perl -wnE'/^BAR="([^"]+)/ and say $1' file
    

    or

    perl -wlne'/^BAR="([^"]+)/ and print $1' file
    

    so as not to enable all optional features with -E. The -w switch enables warnings; it can probably be dropped here.


    As for "remembering these flags", the basics are very reasonable:

    • -e tells the interpreter to Evaluate as code what comes between the quotes; this is what makes it a "one-liner," a program on the command line. It must come right before the quoted program, since the program is its argument

    • -n opens each file given on the command line and feeds the program one line at a time; this is what you want when working with files. The -p switch does the same and also prints every (processed) line

    That's that, for most common needs. So perl -ne'...' file runs code in '' (along with effects of other switches) on every line of the file; I also always throw in -w.

    There are quite a few other switches of course, described in perlrun, for more specific conveniences or uses. A few prominent ones:

    • -M loads a module, as -MModuleName. Can also specify functions to import, see docs

    • -0777 reads the whole file at once ("slurp"). This sets the input record separator ($/) so that the whole file is seen as one "line", so we still need -n as well.

    • -C followed by number/list for Unicode features, for example -CASD

    • -l, used above, handles Line endings: it strips them on input and appends them on output

    Generally the line (or the whole file in slurp mode) goes into the $_ variable, the all-round default in Perl.

    To see code very close to what Perl actually runs for a given one-liner, add -MO=Deparse to the switches; this uses the B::Deparse compiler backend (via the O module).