Tags: bash, docker, docker-compose, eval, docker-machine

eval $(docker-machine env <machine name>) appears to work in a shell script, but fails immediately


To set up a swarm of VMs for Docker on macOS High Sierra (with Docker Desktop 2.0.5.0, although this has also happened with earlier versions), I run the following shell script:

#!/bin/bash

# some variables
MANAGER="swarmMGR"
WORKER="swarmWKR"
MAXNODE=3

# create VMs for swarm
# manager
docker-machine create --driver virtualbox $MANAGER
# workers
for (( i=1; i<=$MAXNODE; i++ ))
do
    docker-machine create --driver virtualbox $WORKER$i
done

# find the manager's IP address
MANAGERIP=$(docker-machine ls | grep $MANAGER | egrep -o '([0-9]{1,3}[.]){3}[0-9]{1,3}')

# initialize the swarm
docker-machine ssh ${MANAGER} "docker swarm init --advertise-addr $MANAGERIP"

# workers join the swarm
# get the token
TOKEN=$(docker-machine ssh $MANAGER "docker swarm join-token worker -q")
# join up
for (( i=1; i<=$MAXNODE; i++ ))
do
    echo 
    docker-machine ssh $WORKER$i "docker swarm join --token ${TOKEN} ${MANAGERIP}"
done

# configure the shell to expose the manager for Docker commands from the host
eval $(docker-machine env $MANAGER)

# list the machines
docker-machine ls

# list the nodes
docker-machine ssh $MANAGER "docker node ls"
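A side note on the IP lookup in the script above: grep $MANAGER matches any row that merely contains the manager's name, and egrep is a deprecated spelling of grep -E. docker-machine also has an ip subcommand (docker-machine ip "$MANAGER") that avoids parsing the table entirely. If parsing the listing is still preferred, the sketch below matches the NAME column exactly. It is only an illustration: sample_ls is a hypothetical stand-in that prints rows copied from the question so the snippet runs without Docker, and it assumes the URL is the fifth column, as it is for running machines.

```shell
#!/bin/bash
MANAGER="swarmMGR"

# Hypothetical stand-in for `docker-machine ls`; rows copied from the question.
sample_ls() {
    cat <<'EOF'
NAME        ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER     ERRORS
swarmMGR    *        virtualbox   Running   tcp://192.168.99.197:2376           v18.09.6
swarmWKR1   -        virtualbox   Running   tcp://192.168.99.198:2376           v18.09.6
EOF
}

# Select the row whose NAME column equals $MANAGER exactly, then split the
# URL column (tcp://IP:PORT) on runs of "/" and ":" and keep the IP part.
manager_ip=$(sample_ls | awk -v name="$MANAGER" \
    '$1 == name { split($5, parts, "[/:]+"); print parts[2] }')

echo "$manager_ip"
```

In a real script, sample_ls would be replaced by docker-machine ls itself.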

The eval $(docker-machine env $MANAGER) appears to work; here is the output at the end of the script:

NAME        ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER     ERRORS
swarmMGR    *        virtualbox   Running   tcp://192.168.99.197:2376           v18.09.6
swarmWKR1   -        virtualbox   Running   tcp://192.168.99.198:2376           v18.09.6
swarmWKR2   -        virtualbox   Running   tcp://192.168.99.199:2376           v18.09.6
swarmWKR3   -        virtualbox   Running   tcp://192.168.99.200:2376           v18.09.6
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ub2qbyd6ca8yzgsq1f0mjsp0i *   swarmMGR            Ready               Active              Leader              18.09.6
dopxb7bgyixqyh3z66rvtik2o     swarmWKR1           Ready               Active                                  18.09.6
vwbd11l36idnphdsoutls2hvp     swarmWKR2           Ready               Active                                  18.09.6
oe80ejzvnhsmhosvjvus6cvb1     swarmWKR3           Ready               Active                                  18.09.6

If I then run a simple command from the host shell, say docker node ls, which should now be operating on the swarm manager, I get the following error:

Error response from daemon: This node is not a swarm manager. Use "docker swarm init" or "docker swarm join" to connect this node to swarm and try again.

If I manually run the eval command in my terminal, the response becomes correct. I have done a lot of research and cannot find a solution to the issue.

Is there something that has to be done when running the eval command from the script in order to get it to work properly?

Additional Information

I ran the portion of the shell script in question with the -x option to output some debugging info:

#!/bin/bash

MANAGER="swarmMGR"

# configure the shell to expose the manager for Docker commands from the host
eval $(docker-machine env "$MANAGER")

# list the machines
docker-machine ls

# list the nodes
docker-machine ssh $MANAGER "docker node ls"

echo $DOCKER_HOST

Here is the output:

$ bash -x ./export.sh
+ MANAGER=swarmMGR
++ docker-machine env swarmMGR
+ eval export 'DOCKER_TLS_VERIFY="1"' export 'DOCKER_HOST="tcp://192.168.99.228:2376"' export 'DOCKER_CERT_PATH="/Users/foo/.docker/machine/machines/swarmMGR"' export 'DOCKER_MACHINE_NAME="swarmMGR"' '#' Run this command to configure your shell: '#' eval '$(docker-machine' env 'swarmMGR)'
++ export DOCKER_TLS_VERIFY=1 export DOCKER_HOST=tcp://192.168.99.228:2376 export DOCKER_CERT_PATH=/Users/foo/.docker/machine/machines/swarmMGR export DOCKER_MACHINE_NAME=swarmMGR
++ DOCKER_TLS_VERIFY=1
++ DOCKER_HOST=tcp://192.168.99.228:2376
++ DOCKER_CERT_PATH=/Users/foo/.docker/machine/machines/swarmMGR
++ DOCKER_MACHINE_NAME=swarmMGR
+ docker-machine ls
NAME        ACTIVE   DRIVER       STATE     URL                         SWARM   DOCKER     ERRORS
swarmMGR    *        virtualbox   Running   tcp://192.168.99.228:2376           v18.09.6
swarmWKR1   -        virtualbox   Running   tcp://192.168.99.229:2376           Unknown    Unable to query docker version: Cannot connect to the docker engine endpoint
swarmWKR2   -        virtualbox   Running   tcp://192.168.99.230:2376           v18.09.6
swarmWKR3   -        virtualbox   Running   tcp://192.168.99.231:2376           v18.09.6
+ docker-machine ssh swarmMGR 'docker node ls'
ID                            HOSTNAME            STATUS              AVAILABILITY        MANAGER STATUS      ENGINE VERSION
ryplgfjntuj7rfoapbl0bb8nj *   swarmMGR            Ready               Active              Leader              18.09.6
n89x4phb9jhilwz74zgta352r     swarmWKR2           Ready               Active                                  18.09.6
sp76b5vws2fcdpjuhxcqhywh3     swarmWKR3           Ready               Active                                  18.09.6
+ echo tcp://192.168.99.228:2376
tcp://192.168.99.228:2376

While the script is running, the environment variables are set correctly. Once the script exits they are gone: running echo $DOCKER_HOST from the command line immediately afterward prints nothing.
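A side note on the trace: because the command substitution is unquoted, the shell splits the output of docker-machine env into words and eval rejoins them into a single line. That is why the trace shows one long export command with the word export repeated as an argument. It still sets the variables here, but quoting the substitution keeps the newlines so each line runs as its own command. The sketch below illustrates the quoted form. fake_machine_env is a hypothetical stand-in that prints the same lines seen in the trace, so the snippet runs without Docker. This is separate from the question of why the variables disappear after the script ends.

```shell
#!/bin/bash
# Hypothetical stand-in for `docker-machine env swarmMGR`; it prints the
# same lines that appear in the trace so the example needs no Docker.
fake_machine_env() {
    cat <<'EOF'
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://192.168.99.228:2376"
export DOCKER_CERT_PATH="/Users/foo/.docker/machine/machines/swarmMGR"
export DOCKER_MACHINE_NAME="swarmMGR"
# Run this command to configure your shell:
# eval $(docker-machine env swarmMGR)
EOF
}

# Quoted: eval receives the text with its newlines intact, so every export
# is a separate command and the trailing comment lines remain comments.
eval "$(fake_machine_env)"

echo "$DOCKER_HOST"
echo "$DOCKER_MACHINE_NAME"
```

In the real script the equivalent line would be eval "$(docker-machine env "$MANAGER")".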


Solution

  • As I continued to research the problem I finally found an answer in this question on AskUbuntu. It turns out the environment variables exported while a script runs do not persist: the script executes in a child shell process, and that process's environment is discarded when it exits. To correct that, all I needed to do was run the script using source, which executes it in the current shell so the exported variables persist until unset:

    source ./create_swarm.sh
    

    or, using the shorthand:

    . ./create_swarm.sh
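
The difference between executing and sourcing can be reproduced without Docker. This is a minimal sketch rather than part of the original script: the temporary file and the variable name DEMO_DOCKER_HOST are made up for the demonstration.

```shell
#!/bin/bash
# Hypothetical throwaway script that only exports one variable.
tmp_script=$(mktemp "${TMPDIR:-/tmp}/demo_env.XXXXXX")
cat > "$tmp_script" <<'EOF'
export DEMO_DOCKER_HOST="tcp://192.168.99.228:2376"
EOF

unset DEMO_DOCKER_HOST

# Executed: the script runs in a child process, and its environment
# disappears when that process exits.
bash "$tmp_script"
after_exec="${DEMO_DOCKER_HOST:-<unset>}"

# Sourced: the same lines run in the current shell, so the export stays.
. "$tmp_script"
after_source="${DEMO_DOCKER_HOST:-<unset>}"

rm -f "$tmp_script"

echo "after executing: $after_exec"
echo "after sourcing:  $after_source"
```

One trade-off to keep in mind: a sourced script shares everything with the calling shell, so helper variables such as MANAGER and TOKEN also remain set afterward, and an exit inside the script would close the interactive shell.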