parallel-processing, gnu, administration

GNU parallel - running with proxy host


I have recently been working with GNU parallel and parallel computation. I have two machines available: A, where I work, and C, which sits behind a proxy machine B. I can reach C only via B, and I cannot run computations on B.

My question is: how do I create the hosts file so that GNU parallel understands that I want to share files with machine C? I have enabled password-less logins to both B and C from B.

Thank you!


Solution

  • From: https://www.gnu.org/software/parallel/man.html#EXAMPLE:-Using-remote-computers-behind-NAT-wall

    EXAMPLE: Using remote computers behind NAT wall

    If the workers are behind a NAT wall, you need some trickery to get to them.

    If you can ssh to a jump host, and reach the workers from there, then the obvious solution would be this, but it does not work:

    parallel --ssh 'ssh jumphost ssh' -S host1 echo ::: DOES NOT WORK
    

    It does not work because the command is dequoted by ssh twice, whereas GNU parallel only expects it to be dequoted once.

    So instead put this in ~/.ssh/config:

    Host host1 host2 host3
      ProxyCommand ssh jumphost.domain nc -w 1 %h 22
    

    It requires nc (netcat) to be installed on jumphost. With this you can simply run:

    parallel -S host1,host2,host3 echo ::: This does work
    
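Applied to the setup in the question (work on A, hop through B, compute on C), the same idea can be sketched as follows. This is not taken from the manual: the aliases `B` and `C` stand in for the real hostnames, the file names `ssh_config_snippet` and `hosts.slf` are hypothetical, and `ProxyJump` is used in place of the `nc`-based `ProxyCommand` on the assumption that A runs OpenSSH 7.3 or later. The script only writes into a scratch directory, so `~/.ssh/config` is untouched until you copy the stanza over yourself.

```shell
#!/bin/sh
# Sketch only: B (jump host) and C (worker) are placeholder aliases.
# Everything is written to a scratch directory, not to ~/.ssh.
set -eu

jump_stanza() {
    # $1 = worker alias, $2 = jump host alias
    # ProxyJump needs OpenSSH >= 7.3 on the machine you start from (A).
    printf 'Host %s\n    ProxyJump %s\n' "$1" "$2"
}

workdir=$(mktemp -d)

# 1. ssh config stanza: reach C from A by hopping through B.
jump_stanza C B > "$workdir/ssh_config_snippet"

# 2. GNU parallel hosts file (one sshlogin per line).
#    Only C is listed, so no jobs are ever started on B.
printf '%s\n' C > "$workdir/hosts.slf"

cat "$workdir/ssh_config_snippet"
cat "$workdir/hosts.slf"
```

Once the stanza is in `~/.ssh/config` on A, a call along the lines of `parallel --sshloginfile hosts.slf --trc {}.out 'wc -l {} > {}.out' ::: *.txt` should copy each input file to C, run the job there, and bring the result back (`--trc` is shorthand for `--transferfile {} --return ... --cleanup`). Adding a line containing only `:` to the hosts file would also run jobs on A itself. Note that with a proxied connection the login to C is authenticated from A, so A's key probably has to be authorized on C, not only B's.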

    No jumphost, but port forwards

    If there is no jumphost but each server has port 22 forwarded from the firewall (e.g. the firewall's port 22001 = port 22 on host1, 22002 = host2, 22003 = host3) then you can use ~/.ssh/config:

    Host host1.v
        Port 22001
    Host host2.v
        Port 22002
    Host host3.v
        Port 22003
    Host *.v
        Hostname firewall
    

    And then use host{1..3}.v as normal hosts:

    parallel -S host1.v,host2.v,host3.v echo ::: a b c
    
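With more than a handful of workers, writing these stanzas by hand gets tedious. A minimal sketch of generating them is shown here, assuming the same naming scheme as above (`hostN.v` mapped to consecutive firewall ports); the function name `port_forward_stanzas` is made up for illustration.

```shell
#!/bin/sh
# Sketch: print ~/.ssh/config stanzas for N workers behind one firewall.
set -eu

port_forward_stanzas() {
    # $1 = firewall hostname, $2 = first forwarded port, $3 = number of hosts
    fw=$1; base=$2; n=$3
    i=1
    while [ "$i" -le "$n" ]; do
        # host1.v gets the first port, host2.v the next one, and so on.
        printf 'Host host%d.v\n    Port %d\n' "$i" "$((base + i - 1))"
        i=$((i + 1))
    done
    # Every *.v alias resolves to the firewall itself.
    printf 'Host *.v\n    Hostname %s\n' "$fw"
}

port_forward_stanzas firewall 22001 3
```

Called with `firewall 22001 3`, this prints the same eight lines as the hand-written config above; review the output before appending it to `~/.ssh/config`.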

    No jumphost, no port forwards

    If ports cannot be forwarded, you need some sort of VPN to traverse the NAT wall. TOR is one option for that, as it is very easy to get working.

    You need to install TOR and set up a hidden service. In torrc put:

    HiddenServiceDir /var/lib/tor/hidden_service/
    HiddenServicePort 22 127.0.0.1:22
    

    Then start TOR: /etc/init.d/tor restart

    The TOR hostname is now in /var/lib/tor/hidden_service/hostname and is something similar to izjafdceobowklhz.onion. Now you simply prepend torsocks to ssh:

    parallel --ssh 'torsocks ssh' -S izjafdceobowklhz.onion \
        -S zfcdaeiojoklbwhz.onion,auclucjzobowklhi.onion echo ::: a b c
    

    If not all hosts are accessible through TOR, prefix only the onion host's entry in the -S list with torsocks ssh:

    parallel -S 'torsocks ssh izjafdceobowklhz.onion,host2,host3' echo ::: a b c
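The mixed `-S` list is easy to mistype, since the `torsocks ssh` prefix applies only to the comma-separated entry it appears in. The sketch below builds such a list from plain hostnames; the helper name `sshlogin_list` is hypothetical and the rule "anything ending in `.onion` goes through torsocks" is an assumption about your hosts.

```shell
#!/bin/sh
# Sketch: build a comma-separated -S value, routing only .onion hosts via torsocks.
set -eu

sshlogin_list() {
    list=
    for h in "$@"; do
        case $h in
            *.onion) entry="torsocks ssh $h" ;;  # reached through TOR
            *)       entry=$h ;;                 # reached with plain ssh
        esac
        # Join entries with commas, without a leading comma.
        list=${list:+$list,}$entry
    done
    printf '%s\n' "$list"
}

sshlogin_list izjafdceobowklhz.onion host2 host3
```

The result can then be passed as `parallel -S "$(sshlogin_list izjafdceobowklhz.onion host2 host3)" echo ::: a b c`.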