The hard way to debug the mysterious git+ssh+proxy failure "bash: No such file or directory"


I'm trying to clone a GitHub repo through a SOCKS5 proxy. In ~/.ssh/config I have:

Host github.com *.github.com
    ProxyCommand /usr/bin/nc -X 5 -x 127.0.0.1:7070 %h %p

git clone fails with the error bash: No such file or directory:

$ git clone git@github.com:aureliojargas/sedsed.git
Cloning into 'sedsed'...
bash: No such file or directory
kex_exchange_identification: Connection closed by remote host
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.

I manually tried the ssh command and it also fails:

$ ssh -v git@github.com
OpenSSH_8.1p1, LibreSSL 2.7.3
debug1: Reading configuration data /Users/pynexj/.ssh/config
debug1: /Users/pynexj/.ssh/config line 16: Applying options for github.com
debug1: Reading configuration data /etc/ssh/ssh_config
debug1: /etc/ssh/ssh_config line 47: Applying options for *
debug1: Executing proxy command: exec /usr/bin/nc -X 5 -x 127.0.0.1:7070 github.com 22
                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
debug1: identity file /Users/pynexj/.ssh/id_rsa type 0
debug1: identity file /Users/pynexj/.ssh/id_rsa-cert type -1
debug1: identity file /Users/pynexj/.ssh/id_dsa type -1
debug1: identity file /Users/pynexj/.ssh/id_dsa-cert type -1
debug1: identity file /Users/pynexj/.ssh/id_ecdsa type -1
debug1: identity file /Users/pynexj/.ssh/id_ecdsa-cert type -1
bash: No such file or directory
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
debug1: identity file /Users/pynexj/.ssh/id_ed25519 type -1
debug1: identity file /Users/pynexj/.ssh/id_ed25519-cert type -1
debug1: identity file /Users/pynexj/.ssh/id_xmss type -1
debug1: identity file /Users/pynexj/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_8.1
kex_exchange_identification: Connection closed by remote host

Then I manually tried the nc command and it actually works:

$ /usr/bin/nc -X 5 -x 127.0.0.1:7070 github.com 22
SSH-2.0-babeld-8cd15329
^C

And the SOCKS5 proxy works fine too:

$ curl -x socks5://127.0.0.1:7070/ https://github.com/ > foo.html
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  214k    0  214k    0     0  86775      0 --:--:--  0:00:02 --:--:-- 86775

I'm very curious what produces the error bash: No such file or directory, and why.


Solution

  • For me the issue turned out to be macOS specific. I Googled a lot and found many reports of broken SSH on macOS 10.15 (Catalina), but none of the workarounds worked for me. Eventually I had to read the OpenSSH source code to find the cause.


    In source file sshconnect.c:

     194 static int
     195 ssh_proxy_connect(struct ssh *ssh, const char *host, const char *host_arg,
     196     u_short port, const char *proxy_command)
     197 {
     ...
     ...
     201     char *shell;
     202
     203     if ((shell = getenv("SHELL")) == NULL || *shell == '\0')
     204         shell = _PATH_BSHELL;
     ...
     ...
     211     command_string = expand_proxy_command(proxy_command, options.user,
     212         host, host_arg, port);
     213     debug("Executing proxy command: %.500s", command_string);
     214
     215     /* Fork and execute the proxy command. */
     216     if ((pid = fork()) == 0) {
     217         char *argv[10];
     ...
     ...
     240         argv[0] = shell;
     241         argv[1] = "-c";
     242         argv[2] = command_string;
     243         argv[3] = NULL;
     244
     245         /* Execute the proxy command.  Note that we gave up any
     246            extra privileges above. */
     247         ssh_signal(SIGPIPE, SIG_DFL);
     248         execv(argv[0], argv);
     249         perror(argv[0]);
     250         exit(1);
     251     }
    

    See lines 203, 240 and 248: ssh runs the ProxyCommand with $SHELL (I found no doc for this), and it uses execv(), which does not search $PATH. Then I checked my $SHELL:

    $ echo $SHELL
    bash
    

    So that's the problem. $SHELL is not the full pathname of an executable, so execv() fails to execute it, and the error bash: No such file or directory comes from perror() in line 249. (The error confused me a lot. The prefix bash: made me think the error came from Bash itself.)
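
A quick way to check for this condition before running ssh is sketched below. The function name shell_is_execv_safe is hypothetical, and the test is deliberately stricter than execv() itself (which also accepts a relative path containing a slash): it only accepts an absolute path to an executable regular file.

```shell
# Return 0 if the value is an absolute path to an executable regular file,
# 1 otherwise. (Hypothetical helper, stricter than execv() itself.)
shell_is_execv_safe() {
    case "$1" in
        /*) [ -f "$1" ] && [ -x "$1" ] ;;
        *)  return 1 ;;
    esac
}

if [ -z "${SHELL:-}" ]; then
    # Lines 203-204 of sshconnect.c: ssh falls back to _PATH_BSHELL.
    echo "SHELL is unset or empty; ssh falls back to its compiled-in default shell"
elif shell_is_execv_safe "$SHELL"; then
    echo "SHELL=$SHELL is an absolute path to an executable"
else
    echo "SHELL=$SHELL is not an absolute path to an executable; ProxyCommand may fail"
fi
```

With SHELL=bash, as in my screen session, this takes the last branch.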

    SOLUTION: Manually set SHELL to the shell's full pathname, e.g. /bin/bash. (I did not write shell /bin/bash in .screenrc because I also have /usr/local/bin/bash.)
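
One way to apply this without hard-coding a path is to resolve the bare name through $PATH, for example from a shell startup file. This is only a sketch: resolve_shell is a hypothetical helper, and it relies on command -v, which returns the first match in $PATH (so with both /bin/bash and /usr/local/bin/bash installed, the result depends on the $PATH order).

```shell
# Print the full pathname for a shell name, or return 1 if none is found.
# (Hypothetical helper; picks the first match in $PATH.)
resolve_shell() {
    case "$1" in
        /*) resolved=$1 ;;                              # already absolute
        *)  resolved=$(command -v "$1") || return 1 ;;  # search $PATH
    esac
    # command -v can print a bare name for builtins or functions;
    # accept only an absolute pathname.
    case "$resolved" in
        /*) printf '%s\n' "$resolved" ;;
        *)  return 1 ;;
    esac
}

# Repair SHELL for the current session if it can be resolved.
if [ -n "${SHELL:-}" ] && fixed=$(resolve_shell "$SHELL"); then
    SHELL=$fixed
    export SHELL
fi
```

For a one-off command, setting the variable inline (SHELL=/bin/bash git clone ...) has the same effect.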


    Then who sets SHELL=bash? Why doesn't it set SHELL=/bin/bash?

    In my ~/.screenrc I have:

    shell bash
    

    According to the screen manual:

    • shell command

      Set the command to be used to create a new shell. This overrides the value of the environment variable $SHELL.

    The SHELL variable is /bin/bash in my interactive shell before I start screen, so it is screen that sets SHELL=bash. I think screen should look up the full pathname of the shell and set SHELL to that, because, according to POSIX:

    This variable shall represent a pathname of the user's preferred command language interpreter.


    Then why does it work fine on my Linux system (Debian), where I also have SHELL=bash (also in screen)?

    I ran strace and got this:

    $ SHELL=xxx strace -f ssh git@github.com
    [...]
    [pid  5767] rt_sigaction(SIGPIPE, NULL, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
    [pid  5767] execve("/root/bin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
    [pid  5767] execve("/usr/local/bin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
    [pid  5767] execve("/usr/local/sbin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
    [pid  5767] execve("/usr/sbin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
    [pid  5767] execve("/usr/bin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
    [pid  5767] execve("/sbin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
    [pid  5767] execve("/bin/xxx", ["xxx", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x561e33a599a0 /* 33 vars */) = -1 ENOENT (No such file or directory)
    [pid  5767] dup(2)                      = 3
    [pid  5767] fcntl(3, F_GETFL)           = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
    [pid  5767] fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x21), ...}) = 0
    [pid  5767] write(3, "xxx: No such file or directory\n", 31xxx: No such file or directory
    ) = 31
    [pid  5767] close(3)                    = 0
    [...]
    

    As we can see, it's actually searching for xxx in $PATH. Why? I guess Debian must have patched OpenSSH and changed its behavior. (I would have verified this if I knew Debian's build internals. :-)
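
The sequence of execve() attempts in the trace is what a $PATH search looks like: one attempt per directory, in order. The sketch below reproduces that list of candidate pathnames for a bare command name. It is only an illustration of the search order, not the code that Debian's ssh runs, and path_candidates is a hypothetical name. (A real search is skipped entirely when the name contains a slash.)

```shell
# Print every pathname a $PATH-style search would try for a bare command
# name, in order, one per line. The body runs in a subshell so the IFS and
# globbing changes do not leak into the caller.
# Usage: path_candidates NAME [SEARCH_PATH]   (SEARCH_PATH defaults to $PATH)
path_candidates() (
    name=$1
    IFS=:
    set -f                       # do not glob-expand directory names
    for dir in ${2:-$PATH}; do
        # An empty $PATH entry means the current directory.
        printf '%s/%s\n' "${dir:-.}" "$name"
    done
)

path_candidates xxx "/usr/local/bin:/usr/bin:/bin"
# /usr/local/bin/xxx
# /usr/bin/xxx
# /bin/xxx
```

With execv(), none of these candidates are tried: the name is used as given, which is why a bare bash fails with ENOENT.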


    UPDATE 2020-11-19:

    I manually compiled OpenSSH (v8.4) from source and reproduced the same issue on Debian. This confirms Debian has patched OpenSSH and changed its behavior.

    $ /usr/local/openssh-8.4/bin/ssh git@github.com
    bash: No such file or directory
    kex_exchange_identification: Connection closed by remote host
    $ strace -f /usr/local/openssh-8.4/bin/ssh git@github.com
    [...]
    [pid 21020] rt_sigaction(SIGPIPE, {sa_handler=SIG_DFL, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f19a05a9840}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
    [pid 21020] execve("bash", ["bash", "-c", "exec nc -X 5 -x 127.0.0.1:7070 g"...], 0x5566982872f0 /* 33 vars */) = -1 ENOENT (No such file or directory)
    [pid 21020] dup(2)                      = 3
    [pid 21020] fcntl(3, F_GETFL)           = 0x8402 (flags O_RDWR|O_APPEND|O_LARGEFILE)
    [pid 21020] fstat(3, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x25), ...}) = 0
    [pid 21020] write(3, "bash: No such file or directory\n", 32bash: No such file or directory
    ) = 32
    [pid 21020] close(3)
    [...]