Tags: docker, docker-swarm, traefik

Docker swarm network not recognizing service/container on worker node. Using Traefik


I'm trying to test out a Docker Swarm load-balanced by Traefik, and I added a blank Apache service to the compose file.

For some reason I'm unable to place this Apache service on a worker node. I get a 502 bad gateway error unless it's on the manager node. Did I configure something wrong in the YML file?


networks:
  proxy:
    external: true

configs:
  traefik_toml_v2:
    file: $PWD/infra/traefik.toml

services:
  traefik:
    image: traefik:1.5-alpine
    deploy:
      replicas: 1
      update_config:
        parallelism: 1
        delay: 5s
      labels:
        - traefik.enable=true
        - traefik.docker.network=proxy
        - traefik.frontend.rule=Host:traefik.example.com
        - traefik.port=8080
        - traefik.backend.loadbalancer.sticky=true
        - traefik.frontend.passHostHeader=true
      placement:
        constraints:
          - node.role == manager
      restart_policy:
        condition: on-failure
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - $PWD/infra/acme.json:/acme.json
    networks:
      - proxy
    ports:
    - target: 80
      protocol: tcp
      published: 80
      mode: ingress
    - target: 443
      protocol: tcp
      published: 443
      mode: ingress
    - target: 8080
      protocol: tcp
      published: 8080
      mode: ingress
    configs:
    - source: traefik_toml_v2
      target: /etc/traefik/traefik.toml
      mode: 444
  server:
    image: bitnami/apache:latest
    networks:
      - proxy
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == worker
      restart_policy:
        condition: on-failure
      labels:
        - traefik.enable=true
        - traefik.docker.network=proxy
        - traefik.port=80
        - traefik.backend=nerdmercs
        - traefik.backend.loadbalancer.swarm=true
        - traefik.backend.loadbalancer.sticky=true
        - traefik.frontend.passHostHeader=true
        - traefik.frontend.rule=Host:www.example.com

As you can see, I've set traefik.backend.loadbalancer.swarm=true on the Apache service.

The proxy network is an overlay network, and I can see it on the worker node:

ubuntu@staging-worker1:~$ sudo docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
f91525416b42        bridge              bridge              local
7c3264136bcd        docker_gwbridge     bridge              local
7752e312e43f        host                host                local
epaziubbr9r1        ingress             overlay             swarm
4b50618f0eb4        none                null                local
qo4wmqsi12lc        proxy               overlay             swarm
ubuntu@staging-worker1:~$

When I inspect that network ID on the manager node:

$ docker network inspect qo4wmqsi12lcvsqd1pqfq9jxj
[
    {
        "Name": "proxy",
        "Id": "qo4wmqsi12lcvsqd1pqfq9jxj",
        "Created": "2018-02-06T09:40:37.822595405Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "1860b30e97b7ea824ffc28319747b23b05c01b3fb11713fa5a2708321882bc5e": {
                "Name": "proxy_visualizer.1.dc0elaiyoe88s0mp5xn96ipw0",
                "EndpointID": "d6b70d4896ff906958c21afa443ae6c3b5b6950ea365553d8cc06104a6274276",
                "MacAddress": "02:42:0a:00:00:09",
                "IPv4Address": "10.0.0.9/24",
                "IPv6Address": ""
            },
            "3ad45d8197055f22f5ce629d896236419db71ff5661681e39c50869953892d4e": {
                "Name": "proxy_traefik.1.wvsg02fel9qricm3hs6pa78xz",
                "EndpointID": "e293f8c98795d0fdfff37be16861afe868e8d3077bbb24df4ecc4185adda1afb",
                "MacAddress": "02:42:0a:00:00:18",
                "IPv4Address": "10.0.0.24/24",
                "IPv6Address": ""
            },
            "735191796dd68da2da718ebb952b0a431ec8aa1718fe3be2880d8110862644a9": {
                "Name": "proxy_portainer.1.xkr5losjx9m5kolo8kjihznvr",
                "EndpointID": "de7ef4135e25939a2d8a10b9fd9bad42c544589684b30a9ded5acfa751f9c327",
                "MacAddress": "02:42:0a:00:00:07",
                "IPv4Address": "10.0.0.7/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4102"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "be4fb35c80f8",
                "IP": "manager IP"
            },
            {
                "Name": "4281cfd9ca73",
                "IP": "worker IP"
            }
        ]
    }
]

Traefik, Portainer, and Visualizer are all present, but the Apache container running on the worker node is not.

Inspecting the same network on the worker node:

$ sudo docker network inspect qo4wmqsi12lc
[
    {
        "Name": "proxy",
        "Id": "qo4wmqsi12lcvsqd1pqfq9jxj",
        "Created": "2018-02-06T19:53:29.104259115Z",
        "Scope": "swarm",
        "Driver": "overlay",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "10.0.0.0/24",
                    "Gateway": "10.0.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "c5725a332db5922a16b9a5e663424548a77ab44ab021e25dc124109e744b9794": {
                "Name": "example_site.1.pwqqddbhhg5tv0t3cysajj9ux",
                "EndpointID": "6866abe0ae2a64e7d04aa111adc8f2e35d876a62ad3d5190b121e055ef729182",
                "MacAddress": "02:42:0a:00:00:3c",
                "IPv4Address": "10.0.0.60/24",
                "IPv6Address": ""
            }
        },
        "Options": {
            "com.docker.network.driver.overlay.vxlanid_list": "4102"
        },
        "Labels": {},
        "Peers": [
            {
                "Name": "be4fb35c80f8",
                "IP": "manager IP"
            },
            {
                "Name": "4281cfd9ca73",
                "IP": "worker IP"
            }
        ]
    }
]

Here the Apache container shows up in the network's container list, but the containers running on the manager node do not.

Portainer is also unable to see the Apache site when it runs on the worker node.


Solution

  • This problem is related to this: Creating new docker-machine instance always fails validating certs using openstack driver

    Basically, the answer there is:

    It turns out my hosting service locked down everything other than 22, 80, and 443 on the Open Stack Security Group Rules. I had to add 2376 TCP Ingress for docker-machine's commands to work.

    That helps explain why docker-machine ssh worked but docker-machine env did not.

    For the swarm itself, look at https://docs.docker.com/datacenter/ucp/2.2/guides/admin/install/system-requirements/#ports-used and make sure all of the listed ports are open between the nodes.
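
    As a rough way to confirm that a firewall or security group is the culprit, the sketch below tries to open TCP connections between nodes. It assumes the ports Docker documents for plain swarm mode: TCP 2377 (cluster management, only listening on managers) and TCP 7946 (node-to-node communication). UDP 7946 and UDP 4789 (overlay/VXLAN traffic) must be open as well, but a simple connect test cannot verify UDP, so they are only noted in a comment. The function names and the example address are hypothetical, not part of Docker or of the original setup.

```python
import socket

# TCP ports Docker documents for swarm mode.
# Also required, but not checkable with a plain TCP connect:
#   UDP 7946 (node-to-node communication) and UDP 4789 (overlay/VXLAN traffic).
SWARM_TCP_PORTS = {
    2377: "cluster management (managers only)",
    7946: "node-to-node communication",
}


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_swarm_tcp_ports(host, ports=SWARM_TCP_PORTS, timeout=3.0):
    """Map each port number to True/False depending on TCP reachability."""
    return {port: tcp_port_open(host, port, timeout) for port in ports}


# Example (run on the worker, with the manager's address substituted in):
#   for port, ok in check_swarm_tcp_ports("10.1.2.3").items():
#       print(port, SWARM_TCP_PORTS[port], "open" if ok else "BLOCKED")
```

    If TCP 7946 shows as blocked between the nodes, the UDP ports are probably filtered by the same rules. That would be consistent with the symptoms above, where each node only sees its own containers on the proxy network.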