This contrived bash script demonstrates the issue.
#!/bin/bash
while read -r node ; do
echo checking $node for Agent;
PID=$(ssh $node ""ps -edf | grep [j]ava | awk '{print $2}'"")
echo $PID got to here.
done < ~/agents_master.list
agents_master.list contains 1 server per line:
server1
server2
server3
Which only outputs the following:
checking server1 for Agent
Authorized use only
25176 got to here
Server 2 and 3 aren't even echoed out to screen by the line echo checking $node...
If I comment out the line PID=$(...) then the while loop processes the whole agents_master.list file correctly:
checking server1 for Agent
got to here
checking server2 for Agent
got to here
checking server3 for Agent
got to here
From the googling I've done, it sounds like this is related to the subshell that $(...) creates, but I don't understand why it is causing the loop to stop at the first server, server1.
Yes, this code could be rewritten, but I'm keen to understand this behaviour of bash and why it happens, for future reference.
The problem, or at least one of the problems, is that ssh is forwarding stdin to the remote server. Inside the loop, ssh inherits the loop's standard input, which is the redirected agents_master.list file. As it happens, the command you are running on the remote server (ps -edf, see below) doesn't use its standard input, but ssh will still read and forward whatever is available, just in case. As a consequence, nothing is left for read to read, so the loop ends after the first iteration.
To avoid that, use ssh -n (or redirect input from /dev/null yourself, which is what the -n option does).
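The effect can be reproduced without any remote host. In this minimal sketch, the function greedy is a hypothetical stand-in for ssh (not part of your script): like ssh without -n, it reads everything available on the stdin it inherits from the loop. The temporary file plays the role of agents_master.list.

```shell
#!/bin/bash
# Hypothetical stand-in for ssh: cat inherits the loop's stdin and drains it,
# much as ssh does when it forwards stdin to the remote host.
greedy() { cat > /dev/null; }

list=$(mktemp)
printf '%s\n' server1 server2 server3 > "$list"

# Unprotected: greedy swallows the remaining lines, so read hits end-of-file.
broken=$(while read -r node; do
    echo "checking $node"
    greedy
done < "$list")

# Protected: stdin comes from /dev/null, which is what ssh -n arranges.
fixed=$(while read -r node; do
    echo "checking $node"
    greedy < /dev/null
done < "$list")

rm -f "$list"
printf 'without redirect:\n%s\n' "$broken"
printf 'with redirect:\n%s\n' "$fixed"
```

The first loop reports only server1; the second reports all three servers.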
There are a couple of other issues which are not actually interfering with your script's execution.
First, I have no idea why you use "" in
ssh $node ""ps -edf | grep [j]ava | awk '{print $2}'""
The "" "expands" to an empty string, so the above is effectively identical to
ssh $node ps -edf | grep [j]ava | awk '{print $2}'
That means that the grep and awk commands are being run on the local host; the output from the ps command is forwarded back to the local host by ssh. That doesn't change anything, although it does make the brackets in [j]ava redundant, since the grep won't show up in the process list, as it is not running on the host where the ps is executed. In fact, it's a good thing that the brackets are redundant, since they might not be present in the command if there happens to be a file named java in your current working directory. You really should quote that argument.
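To see why the unquoted [j]ava is fragile, here is a small sketch that uses echo to show the arguments the shell would hand to grep. It runs in a scratch directory created with mktemp (an assumption for illustration), and it assumes bash's default globbing options (nullglob and failglob unset).

```shell
#!/bin/bash
# Scratch directory so the demonstration leaves nothing behind.
dir=$(mktemp -d)

# No file matches the pattern yet, so it is passed through unchanged.
before=$(cd "$dir" && echo grep [j]ava)

# Once a file named java exists, filename expansion replaces the pattern.
touch "$dir/java"
after=$(cd "$dir" && echo grep [j]ava)

# Quoting suppresses filename expansion regardless of what files exist.
quoted=$(cd "$dir" && echo grep '[j]ava')

rm -rf "$dir"
echo "$before"
echo "$after"
echo "$quoted"
```

Only the middle line loses its brackets, which is exactly the case where grep would start matching its own entry in a process list.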
I presume that what you intended was to run the entire pipeline on the remote machine, in which case you might have tried:
ssh $node "ps -edf | grep [j]ava | awk '{print $2}'"
and found that it didn't work. It wouldn't have worked because the $2 in the awk command will be expanded to whatever $2 is in your current shell; the $2 is not protected by interior single-quotes. As far as bash is concerned, $2 is just part of a double-quoted string. (It would also shift the issue of the unquoted grep argument to the remote host, so you'd have problems if there is a file named java in the home directory on the remote host.)
So what you actually want is
ssh -n $node 'ps -edf | grep "[j]ava" | awk "{print \$2}"'
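The difference between the two quoting styles can be checked locally. In this sketch, set -- gives the local shell a value for $2, and bash -c stands in for the remote shell that ssh would start; both are assumptions for illustration, as is the sample process line fed through printf in place of real ps output.

```shell
#!/bin/bash
set -- first second   # give the local shell's $2 a visible value

# Double quotes: the local shell substitutes $2 before ssh ever sees the string.
double="ps -edf | grep [j]ava | awk '{print $2}'"

# Single quotes: the string reaches the remote shell untouched.
single='ps -edf | grep "[j]ava" | awk "{print \$2}"'

echo "$double"
echo "$single"

# Same quoting as the single-quoted form, with printf standing in for ps.
# The inner shell turns \$2 into a literal $2, so awk prints the second field.
sample='printf "user 4242 java\n" | grep "[j]ava" | awk "{print \$2}"'
pid=$(bash -c "$sample")
echo "$pid"
```

The first echo shows awk being asked to print the variable second, which is not what was intended; the single-quoted form survives intact and the inner shell hands awk a proper $2.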
Finally, don't use PID as the name of a shell variable. Variable names in all upper case are generally reserved, and it is perilously close to BASHPID and PPID, which are specific bash variables. Your own shell variables should have lower-case names, as in any other programming language.
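Putting the pieces together, the loop might look something like the sketch below. The function name check_agents and passing the list file as an argument are my own choices for illustration, not something your script requires.

```shell
#!/bin/bash
# Sketch of the loop with the fixes applied. check_agents is a hypothetical
# helper name; the list file path is passed as its first argument.
check_agents() {
    local list=$1 node pid
    while read -r node; do
        echo "checking $node for Agent"
        # -n keeps ssh away from the loop's stdin; the single quotes send the
        # whole pipeline to the remote shell without local expansion.
        pid=$(ssh -n "$node" 'ps -edf | grep "[j]ava" | awk "{print \$2}"')
        echo "$pid got to here."
    done < "$list"
}

# Usage: check_agents ~/agents_master.list
```

Quoting "$node" and using a lower-case pid are not strictly needed to fix the loop, but they follow the same advice as above.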