Search code examples
bashunixdb2pipeline

Getting chained/piped commands result to shell variable


Given a db2 proc call's output:

$ db2 "call SOME_PROC(103,5,0,'','',0,0)"

  Return Status = 0

I wish to just get the value and when I 'chain-em-up' it does not work as I think it should, so given:

$ db2 "call SOME_PROC(103,5,0,'','',0,0)" | sed -rn 's/ *Return Status *= *([0-9]+)/\1/p'
0

I try to chain 'em up:

$ var=$(db2 "call SOME_PROC(103,5,0,'','',0,0)" | sed -rn 's/ *Return Status *= *([0-9]+)/\1/p')
$ echo $var

You get nothin !

But if you redirect to tmp file:

$ db2 "call SOME_PROC(103,5,0,'','',0,0)" > /tmp/fff
$ var=$(cat /tmp/fff | sed -rn 's/ *Return Status *= *([0-9]+)/\1/p')
$ echo $var
0

You do get it ...

Similarly if you put in var:

$ var=$(db2 "call DB2INST1.USP_SPOTLIGHT_GET_DATA(103,5,0,'','',0,0)")
$ var=$(echo $var | sed -rn 's/ *Return Status *= *([0-9]+)/\1/p')
$ echo $var
0

You also get it ...

Is there a way to get value as my first attempt? And also I wonder why does it not work? What am I missing?

I also tried the below and it also givs nothing!

cat <(db2 -x "call DB2INST1.USP_SPOTLIGHT_GET_DATA(103,5,0,'','',0,0)" | sed -rn 's/ *Return Status *= *([0-9]+)/\1/p')

Solution

  • The db2 command-line interface requires that the db2 command be issued as a direct child of the parent of the command which initiated the connection. In other words, the db2 call and db2 connect commands need to be initiated from the same shell process.

    That does not interact well with many shell features:

    • pipelines: cmd1 | cmd2 runs both commands in subshells (different processes).

    • command substitution: $(cmd) runs the command in a subshell.

    • process substitution (bash): <(cmd) runs the command in a subshell.

    However, if the shell is bash, the situation is not quite that restricted, so under some circumstances the above constructions will still work with db2. In pipelines and command substitution, bash will optimize away the subshell if the command to be run in the subshell is simple enough. (Roughly speaking, it must be a simple command without redirects.)

    So, for example, if some bash process P executes

    cmd1 | cmd2
    

    then both commands have P as their parent, because they are both simple commands. Similarly with

    a=$(cmd)
    

    However, if a pipelined command or a substituted command is not simple, then a subshell is required. For example, even though { ...} does not require a subshell, the syntax is not a simple command. So in

    { cmd1; } | cmd2
    

    the first command is a child of a subshell, while the second command is a child of the main shell.

    In particular, in

    a=$(cmd1 | cmd2)
    

    bash will not optimize away the command-substitution subshell, because cmd1 | cmd2 is not a simple command. So it will create a subshell to execute the pipeline; that subshell will apply the optimization, so there will not be additional subshells to execute the simple commands cmd1 and cmd2.

    In short, you can pipeline the output of db2 call or you can capture the output in a variable, but you cannot capture the output of a pipeline in a variable.


    Other shells are more (or less) capable of subshell optimizations. With zsh, for example, you can use process substitution to avoid the subshell:

    # This will work with zsh but not with bash
    var=$(db2 "call ..." > >(sed -rn ...))
    

    The syntax cmd1 > >(cmd2) is very similar to the pipeline cmd1 | cmd2, but it differs in that is syntactically a simple command. For zsh, that is sufficient to allow the elimination of the subshell (but not for bash, which won't optimize away a subshell if the command involves a redirection).