I'm trying to set up an HA Docker cluster on 3 dedicated PCs. I've successfully followed the instructions on docs.docker.com/engine/installation/linux/ubuntulinux and now I'm trying to follow the instructions on https://docs.docker.com/swarm/install-manual. Since I'm not using any virtualization, I start at "Set up an consul discovery backend". The PCs (running Ubuntu Trusty 14.04 Server Edition) are all on the LAN 192.168.2.0/24: ubuntu001 has .104, ubuntu002 has .106, and ubuntu003 has .105.
I did the following according to the instructions:
arnolde@ubuntu001:~$ docker run -d -p 8500:8500 --name=consul progrium/consul -server -bootstrap
arnolde@ubuntu001:~$ docker run -d -p 4000:4000 swarm manage -H :4000 --replication --advertise 192.168.2.104:4000 consul://192.168.2.104
arnolde@ubuntu002:~# docker run -d swarm manage -H :4000 --replication --advertise 192.168.2.106:4000 consul://192.168.2.104:8500
arnolde@ubuntu003:~$ docker run -d swarm join --advertise=192.168.2.105:2375 consul://192.168.2.104:8500
But when I try the next step, the swarm manager does NOT show up as "Primary" as the docs say it should, and no primary is listed at all:
arnolde@ubuntu001:~$ docker -H :4000 info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: swarm/1.1.0
Role: replica
Primary:
Strategy: spread
Filters: health, port, dependency, affinity, constraint
Nodes: 0
Plugins:
Volume:
Network:
Kernel Version: 3.19.0-25-generic
Operating System: linux
Architecture: amd64
CPUs: 0
Total Memory: 0 B
And:
arnolde@ubuntu001:~$ docker -H :4000 run hello-world
docker: Error response from daemon: No elected primary cluster manager.
I searched and found https://github.com/docker/swarm/issues/1491, which recommends using dockerswarm/swarm:master instead. I tried that, but it didn't help:
arnolde@ubuntu001:~$ docker run -d -p 4000:4000 dockerswarm/swarm:master manage -H :4000 --replication --advertise 192.168.2.104:4000 consul://192.168.2.104
I didn't find any other input regarding swarm+consul+primary, so here I am. Any suggestions? Unfortunately I'm not sure how to troubleshoot, since I don't even know where to look for logging/debugging info, e.g. whether the manager is connecting to consul successfully.
I was able to solve it myself by explicitly adding the port number to the consul:// parameter (consul://192.168.2.104:8500 instead of consul://192.168.2.104). Apparently the Docker docs are incomplete on this point:
arnolde@ubuntu001:~$ docker run -d -p 4000:4000 dockerswarm/swarm:master manage -H :4000 --replication --advertise 192.168.2.104:4000 consul://192.168.2.104:8500
arnolde@ubuntu001:~$ docker -H :4000 info
Containers: 0
Running: 0
Paused: 0
Stopped: 0
Images: 0
Server Version: swarm/1.1.0
Role: replica
Primary: 192.168.2.106:4000
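To check for an elected primary without reading the whole output by eye, something like the following could work. It is only a sketch: swarm_primary is a helper name I made up, and it assumes the "Primary:" line looks like it does in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `docker info` output on stdin and print the
# address found on the "Primary:" line. Prints nothing and returns 1
# when that line is empty, i.e. no primary has been elected.
swarm_primary() {
    primary=$(sed -n 's/^[[:space:]]*Primary:[[:space:]]*//p' | head -n 1)
    [ -n "$primary" ] && printf '%s\n' "$primary"
}

# Example call against a manager listening on port 4000:
#   docker -H :4000 info | swarm_primary || echo "no primary elected yet"
```

Run against the first output in the question it would report that no primary is elected; against the output above it would print the address of the manager on ubuntu002.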
I also added "-p 4000:4000" to the command on the replica manager (on ubuntu002). I'm not sure if that was necessary (or even a good idea).
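As for where to look when this goes wrong: a rough sketch of what I would check next time is below. The function names (has_explicit_port, diagnose) are my own, the container name is whatever "docker ps" shows for the manager, and I'm assuming Consul's HTTP API is reachable on the published port 8500. The first helper just catches the mistake from the question (a discovery URL without a port); it does not handle IPv6 addresses.

```shell
#!/bin/sh
# Hypothetical helper: return 0 if a discovery URL such as
# consul://host:port names an explicit port, 1 otherwise.
has_explicit_port() {
    hostport=${1#*://}        # strip the scheme, e.g. "consul://"
    hostport=${hostport%%/*}  # strip any path after the host
    case $hostport in
        *:[0-9]*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: show the manager's recent log lines and ask Consul
# for its own leader and for the keys currently stored in its KV store.
#   $1 = name or ID of the swarm manager container (see `docker ps`)
#   $2 = Consul HTTP address, e.g. 192.168.2.104:8500
diagnose() {
    docker logs --tail 50 "$1"
    curl -s "http://$2/v1/status/leader"; echo
    curl -s "http://$2/v1/kv/?keys"; echo
}

# Example calls:
#   has_explicit_port consul://192.168.2.104 || echo "discovery URL has no port"
#   diagnose <manager-container> 192.168.2.104:8500
```

If the key listing comes back empty while the managers are running, that would suggest they are not reaching Consul at all, which is presumably what happened with the port-less URL.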