Search code examples
bashsudosubshell

Bash - Why $(sudo cat file) cannot find a file which exists?


Question

Why $(sudo cat) cannot find a file which exists?

This works:

for host in $(cat /etc/ansible/hosts | cut -d ' ' -f 1 | grep -P '^master-' | sort | uniq)
do
    ssh ${host} /bin/bash << EOF
        sudo cat /etc/origin/master/ca-bundle.crt
EOF 
done

-----BEGIN CERTIFICATE-----
XYZ...
-----END CERTIFICATE-----

This does not work but no idea why:

for host in $(cat /etc/ansible/hosts | cut -d ' ' -f 1 | grep -P '^master-' | sort | uniq)
do
    ssh ${host} /bin/bash << EOF
        RESULT="$(sudo cat /etc/origin/master/ca-bundle.crt; echo x)"
        echo "${RESULT%x}"
EOF
done

"cat: /etc/origin/master/ca-bundle.crt: No such file or directory"

Fix

Based on the answer, fixed as below and worked. Thanks a lot.

#!/bin/bash
set -u
for host in $(cat /etc/ansible/hosts | cut -d ' ' -f 1 | grep -P '^master-' | sort | uniq)
do
    ssh ${host} /bin/bash <<'EOF'
RESULT="$(sudo cat /etc/origin/master/ca-bundle.crt; echo x)"
echo "${RESULT%x}"
EOF
done

Solution

  • The problem is occurring because command and variable substitutions are performed in here-documents before they're passed to the command. Thus, $(sudo cat /etc/origin/master/ca-bundle.crt; echo x) and ${RESULT%x} are both evaluated on the local computer (where /etc/origin/master/ca-bundle.crt presumably doesn't exist and RESULT isn't defined). As a result, this is what gets sent to the remote computer:

        RESULT="x"
        echo ""
    

    Which isn't what you wanted at all.

    To avoid this, you can either escape the $ characters in the here-document, or quote the here-document delimiter (<<'EOF'). BTW, the indent characters in the here-document will be passed over the ssh connection, which can have unfortunate consequences (e.g. invoking the remote shell's autocomplete features). I'd either outdent the here-document, or prefix the delimiter with "-" (<<-'EOF') and use tabs (not spaces) for the indentation.

    EDIT: After thinking about @tripleee's answer, I'm not sure I agree with all of his suggested rewrites, but he's certainly right that parts of the loop are more complex and/or fragile than they should be. Here are my suggestions for changes:

    • Unless there's more being done on the remote computer than is shown in the question, the ssh part can (and should) be drastically simplified. There's no need to put the content of the certificate file in a variable and then echo the variable, just output it directly. And note that the entire problem arose because of the complications of handing getting it into and out of the variable; if the remote part was just sudo cat /etc/origin/master/ca-bundle.crt, the whole thing would've just worked. Furthermore, there's no need to explicitly run a remote bash shell and use a redirect to give it the command to run, just use ssh ${host} sudo cat /etc/origin/master/ca-bundle.crt.

      Ok, there is one thing the variable stuff did do. It added a blank line after the content of the certificate file. This happens because the "x" trick carefully preserves the final newline in the file's contents (normally, $( ) removes trailing newlines), and then echo adds another newline, resulting in a blank line. But if this was intentional, it's much easier (and clearer) to do just by adding another echo command with no arguments.

    The pipeline to list the hostnames is much longer than it needs to be, but here there's a tradeoff between complexity and readability/clarity. tripleee's version is much shorter, but it's harder to read off what it's doing (unless you're fluent in awk), and that means there's more risk of bugs, it's harder to maintain, etc. But there are still some simplifications I'd recommend:

    • Don't use cat to feed a single file into a pipeline. Either use a <filename redirect, or (if the command supports it) just give it the command the filename and let it read from it:

      cut -d ' ' -f 1 /etc/ansible/hosts | ...
      cut -d ' ' -f 1 </etc/ansible/hosts | ...
      

      This may seem less natural than using cat |, but it's the best (and most standard) way to do things. Frankly, if it seems unnatural or confusing, that just means you haven't done it enough. So do it more.

    • Your pipeline has each command doing just on thing: cut gets the first field, grep selects those that start with "master-", sort puts them in order, and uniq removes duplicates. But sort is perfectly capable of removing duplicates itself, so I'd use sort -u instead of sort | uniq. And awk can do almost everything all by itself (but gets hard to read/understand if you make it do too much). Personally, I'd use:

      awk '$1 ~ /^master-/ {print $1}' /etc/ansible/hosts | sort -u
      

      (That awk script can be read as "if the first field starts with 'master-', print it.)

    Then there's the question of how to iterate over the list of hosts. for var in $(somecommand) is widely considered an antipattern, because of the number of things that can go wrong. It breaks the output of the command into "words" (which might or might not correspond to lines), and then tries to expand any wildcards it finds into lists of matching files (which can have really weird effects), and then iterates over the result of that.

    In this particular case, I think you're safe (unless you redefined $IFS). There shouldn't be any word delimiters (spaces, tabs, and newlines) in your entries because you're already selecting just the first field. There shouldn't be any wildcards in the entries, because "*", "?", and "[" aren't generally allowed in hostnames. I suppose there might be a raw IPv6 address in "[hexdigits:hexdigits::hexdigits]" format -- which is a shell wildcard and will cause trouble if there are matching files -- but those aren't going to be selected by the ^master- pattern.

    The alternatives are to use pipe the list to either an xargs command (as in @tripleee's answer) or a while read loop. These both have a different potential problem in that ssh can steal hostnames from the input stream if you're not careful. ssh -n will prevent this.

    In this case, I think I'd just use the for loop. So with all of this, here's my suggested rewrite:

    #!/bin/bash
    set -u
    for host in $(awk '$1 !~ /^master-/ {print $1}' /etc/ansible/hosts | sort -u); do
        ssh "${host}" sudo cat /etc/origin/master/ca-bundle.crt
        echo
    done