Bidirectional UDP Port for docker container


I have Consul running in a Docker container.
When I start another Consul agent (not in Docker), it logs:

[WARN] memberlist: Was able to reach container_name via TCP but not UDP, network may be misconfigured and not allowing bidirectional UDP

I am trying to form a cluster here, but leader election keeps failing.
How can I fix this?

My port specification in docker-compose.yml (docker-compose file format version 1):

  ports:
    - "8300:8300"
    - "8301:8301"
    - "8301:8301/udp"
    - "8302:8302"
    - "8302:8302/udp"
    - "8400:8400"
    - "8500:8500"
    - "8600:8600"
    - "8600:8600/udp"

Log of Consul1 running in Docker Container:

         Node name: '<host>'
        Datacenter: 'dc1'
            Server: true (bootstrap: true)
       Client Addr: 0.0.0.0 (HTTP: 8500, HTTPS: -1, DNS: 8600, RPC: 8400)
      Cluster Addr: <host_ip> (LAN: 8301, WAN: 8302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
             Atlas: <disabled>

==> Log data will now stream in as it occurs:

    2017/06/08 03:39:44 [INFO] raft: Restored from snapshot 13-35418-1496826625488
    2017/06/08 03:39:44 [INFO] serf: EventMemberJoin: <host> <host_ip>
    2017/06/08 03:39:44 [INFO] raft: Node at <host_ip>:8300 [Follower] entering Follower state
    2017/06/08 03:39:44 [INFO] consul: adding LAN server <host> (Addr: <host_ip>:8300) (DC: dc1)
    2017/06/08 03:39:44 [INFO] serf: EventMemberJoin: <host>.dc1 <host_ip>
    2017/06/08 03:39:44 [INFO] consul: adding WAN server <host>.dc1 (Addr: <host_ip>:8300) (DC: dc1)
    2017/06/08 03:39:44 [ERR] agent: failed to sync remote state: No cluster leader
    2017/06/08 03:39:45 [WARN] raft: Heartbeat timeout reached, starting election
    2017/06/08 03:39:45 [INFO] raft: Node at <host_ip>:8300 [Candidate] entering Candidate state
    2017/06/08 03:39:45 [INFO] raft: Election won. Tally: 1
    2017/06/08 03:39:45 [INFO] raft: Node at <host_ip>:8300 [Leader] entering Leader state
    2017/06/08 03:39:45 [INFO] consul: cluster leadership acquired
    2017/06/08 03:39:45 [INFO] consul: New leader elected: <host>
    2017/06/08 03:39:45 [INFO] raft: Disabling EnableSingleNode (bootstrap)
    2017/06/08 03:39:45 [INFO] raft: Added peer <host_ip>:9300, starting replication
    2017/06/08 03:39:45 [INFO] raft: Removed peer <host_ip>:9300, stopping replication (Index: 36201)
    2017/06/08 03:39:45 [INFO] raft: Added peer <host_ip>:9300, starting replication
    2017/06/08 03:39:45 [INFO] raft: Added peer <host_ip>:10300, starting replication
    2017/06/08 03:39:45 [INFO] raft: Removed peer <host_ip>:10300, stopping replication (Index: 36228)
    2017/06/08 03:39:45 [INFO] raft: Removed peer <host_ip>:9300, stopping replication (Index: 36230)
    2017/06/08 03:39:45 [ERR] raft: Failed to AppendEntries to <host_ip>:10300: dial tcp <host_ip>:10300: getsockopt: connection refused
    2017/06/08 03:39:45 [ERR] raft: Failed to AppendEntries to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:39:45 [ERR] raft: Failed to AppendEntries to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:39:45 [ERR] raft: Failed to AppendEntries to <host_ip>:10300: dial tcp <host_ip>:10300: getsockopt: connection refused
    2017/06/08 03:39:45 [ERR] raft: Failed to AppendEntries to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:39:49 [WARN] agent: Check 'vault::8200:vault-sealed-check' missed TTL, is now critical
    2017/06/08 03:39:50 [INFO] serf: EventMemberJoin: server2 <host_ip>
    2017/06/08 03:39:50 [INFO] consul: adding LAN server server2 (Addr: <host_ip>:9300) (DC: dc1)
    2017/06/08 03:39:50 [INFO] raft: Added peer <host_ip>:9300, starting replication
    2017/06/08 03:39:50 [WARN] raft: AppendEntries to <host_ip>:9300 rejected, sending older logs (next: 36231)
    2017/06/08 03:39:50 [INFO] raft: pipelining replication to peer <host_ip>:9300
    2017/06/08 03:39:50 [INFO] consul: member 'server2' joined, marking health alive
    2017/06/08 03:39:52 [INFO] agent: Synced service 'vault::8200'
    2017/06/08 03:39:52 [INFO] agent: Synced check 'vault::8200:vault-sealed-check'
    2017/06/08 03:40:06 [INFO] agent: Synced check 'vault::8200:vault-sealed-check'
    2017/06/08 03:40:18 [ERR] raft: Failed to heartbeat to <host_ip>:9300: EOF
    2017/06/08 03:40:18 [INFO] raft: aborting pipeline replication to peer <host_ip>:9300
    2017/06/08 03:40:19 [ERR] raft: Failed to AppendEntries to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:19 [ERR] raft: Failed to heartbeat to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:19 [ERR] raft: Failed to AppendEntries to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:19 [ERR] raft: Failed to heartbeat to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:19 [ERR] raft: Failed to AppendEntries to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:19 [ERR] raft: Failed to AppendEntries to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:19 [ERR] raft: Failed to heartbeat to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:19 [WARN] raft: Failed to contact <host_ip>:9300 in 501.593114ms
    2017/06/08 03:40:19 [WARN] raft: Failed to contact quorum of nodes, stepping down
    2017/06/08 03:40:19 [INFO] raft: Node at <host_ip>:8300 [Follower] entering Follower state
    2017/06/08 03:40:19 [INFO] consul: cluster leadership lost
    2017/06/08 03:40:19 [ERR] raft: Failed to AppendEntries to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:20 [WARN] raft: Heartbeat timeout reached, starting election
    2017/06/08 03:40:20 [INFO] raft: Node at <host_ip>:8300 [Candidate] entering Candidate state
    2017/06/08 03:40:20 [ERR] raft: Failed to make RequestVote RPC to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:21 [INFO] memberlist: Suspect server2 has failed, no acks received
    2017/06/08 03:40:22 [WARN] raft: Election timeout reached, restarting election
    2017/06/08 03:40:22 [INFO] raft: Node at <host_ip>:8300 [Candidate] entering Candidate state
    2017/06/08 03:40:22 [ERR] raft: Failed to make RequestVote RPC to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:23 [INFO] memberlist: Suspect server2 has failed, no acks received
    2017/06/08 03:40:23 [WARN] dns: Query results too stale, re-requesting
    2017/06/08 03:40:23 [ERR] dns: rpc error: No cluster leader
    2017/06/08 03:40:23 [WARN] raft: Election timeout reached, restarting election
    2017/06/08 03:40:23 [INFO] raft: Node at <host_ip>:8300 [Candidate] entering Candidate state
    2017/06/08 03:40:23 [ERR] raft: Failed to make RequestVote RPC to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:24 [WARN] raft: Election timeout reached, restarting election
    2017/06/08 03:40:24 [INFO] raft: Node at <host_ip>:8300 [Candidate] entering Candidate state
    2017/06/08 03:40:24 [ERR] raft: Failed to make RequestVote RPC to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:24 [ERR] http: Request PUT /v1/session/renew/8c4efe65-07c3-f93e-6679-f2bc95f8e92c, error: No cluster leader from=172.17.0.4:57031
    2017/06/08 03:40:25 [INFO] memberlist: Suspect server2 has failed, no acks received
    2017/06/08 03:40:25 [ERR] http: Request PUT /v1/session/renew/8c4efe65-07c3-f93e-6679-f2bc95f8e92c, error: No cluster leader from=172.17.0.4:57061
    2017/06/08 03:40:26 [INFO] memberlist: Suspect server2 has failed, no acks received
    2017/06/08 03:40:26 [INFO] memberlist: Marking server2 as failed, suspect timeout reached
    2017/06/08 03:40:26 [INFO] serf: EventMemberFailed: server2 <host_ip>
    2017/06/08 03:40:26 [INFO] consul: removing LAN server server2 (Addr: <host_ip>:9300) (DC: dc1)
    2017/06/08 03:40:26 [WARN] raft: Election timeout reached, restarting election
    2017/06/08 03:40:26 [INFO] raft: Node at <host_ip>:8300 [Candidate] entering Candidate state
    2017/06/08 03:40:26 [ERR] raft: Failed to make RequestVote RPC to <host_ip>:9300: dial tcp <host_ip>:9300: getsockopt: connection refused
    2017/06/08 03:40:26 [ERR] agent: coordinate update error: No cluster leader
    2017/06/08 03:40:26 [ERR] http: Request PUT /v1/session/renew/8c4efe65-07c3-f93e-6679-f2bc95f8e92c, error: No cluster leader from=172.17.0.4:57064
    2017/06/08 03:40:27 [WARN] dns: Query results too stale, re-requesting
    2017/06/08 03:40:27 [ERR] dns: rpc error: No cluster leader
    2017/06/08 03:40:27 [WARN] raft: Election timeout reached, restarting election

Log of consul2:

==> WARNING: Expect Mode enabled, expecting 2 servers
==> Starting Consul agent...
==> Starting Consul agent RPC...
==> Consul agent running!
         Node name: 'server2'
        Datacenter: 'dc1'
            Server: true (bootstrap: false)
       Client Addr: 0.0.0.0 (HTTP: 9500, HTTPS: -1, DNS: 9600, RPC: 9400)
      Cluster Addr: <host_ip> (LAN: 9301, WAN: 9302)
    Gossip encrypt: false, RPC-TLS: false, TLS-Incoming: false
             Atlas: <disabled>

==> Log data will now stream in as it occurs:

    2017/06/08 09:09:50 [INFO] raft: Restored from snapshot 13-35418-1496892834061
    2017/06/08 09:09:50 [INFO] serf: EventMemberJoin: server2 <host_ip>
    2017/06/08 09:09:50 [INFO] serf: EventMemberJoin: server2.dc1 <host_ip>
    2017/06/08 09:09:50 [INFO] raft: Node at <host_ip>:9300 [Follower] entering Follower state
    2017/06/08 09:09:50 [INFO] consul: adding LAN server server2 (Addr: <host_ip>:9300) (DC: dc1)
    2017/06/08 09:09:50 [INFO] consul: adding WAN server server2.dc1 (Addr: <host_ip>:9300) (DC: dc1)
    2017/06/08 09:09:50 [ERR] agent: failed to sync remote state: No cluster leader
    2017/06/08 09:09:50 [INFO] agent: Joining cluster...
    2017/06/08 09:09:50 [INFO] agent: (LAN) joining: [<host_ip>:8301 <host_ip>:10301]
    2017/06/08 09:09:50 [INFO] serf: EventMemberJoin: <host> <host_ip>
    2017/06/08 09:09:50 [INFO] consul: adding LAN server <host> (Addr: <host_ip>:8300) (DC: dc1)
    2017/06/08 09:09:50 [INFO] agent: (LAN) joined: 1 Err: <nil>
    2017/06/08 09:09:50 [INFO] agent: Join completed. Synced with 1 initial agents
    2017/06/08 09:09:50 [WARN] raft: Failed to get previous log: 36233 log not found (last: 36230)
    2017/06/08 09:09:50 [INFO] raft: Removed ourself, transitioning to follower
    2017/06/08 09:09:50 [INFO] raft: Removed ourself, transitioning to follower
    2017/06/08 09:09:52 [WARN] memberlist: Was able to reach <host> via TCP but not UDP, network may be misconfigured and not allowing bidirectional UDP
==> Newer Consul version available: 0.8.3
    2017/06/08 09:09:54 [WARN] memberlist: Was able to reach <host> via TCP but not UDP, network may be misconfigured and not allowing bidirectional UDP
    2017/06/08 09:09:56 [WARN] memberlist: Was able to reach <host> via TCP but not UDP, network may be misconfigured and not allowing bidirectional UDP
    2017/06/08 09:09:57 [WARN] memberlist: Was able to reach <host> via TCP but not UDP, network may be misconfigured and not allowing bidirectional UDP

Solution

  • By "bidirectional UDP", Consul means that the Consul agent must be able to reach its Consul server over UDP and, vice versa, the Consul server must be able to reach its agent over UDP.

    Consul agent -- [UDP] --> Consul Server
    Consul agent <--[UDP] --  Consul Server
    

    These are two separate communications. With TCP, by contrast, replies travel back over the same connection that the agent already initiated.

    So, if your Consul agent and server are not on the same network (e.g. the same Docker network), you need to expose the ports on both ends. Also take into account the advertise address, which is the address an agent announces to other members as the one to contact it on.
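
As a rough illustration of the two points above, here is a minimal docker-compose sketch in the version 1 file format used in the question. The service name `consul1`, the `consul` image, and the exact command line are assumptions for illustration and are not taken from the question; `<host_ip>` is the same placeholder that appears in the logs. It publishes the gossip ports over both TCP and UDP and advertises an address that agents outside Docker can reach:

```yaml
# Hypothetical service name and image; adjust to your own setup.
consul1:
  image: consul
  # -advertise sets the address this server announces to other members.
  # It should be an address the non-Docker agent can reach, not the
  # container's internal IP. <host_ip> is a placeholder.
  command: agent -server -bootstrap -client=0.0.0.0 -advertise=<host_ip>
  ports:
    - "8300:8300"       # server RPC (TCP)
    - "8301:8301"       # LAN gossip (TCP)
    - "8301:8301/udp"   # LAN gossip (UDP)
    - "8302:8302"       # WAN gossip (TCP)
    - "8302:8302/udp"   # WAN gossip (UDP)
    - "8500:8500"       # HTTP API
```

This covers only the container side. The agent running outside Docker (LAN port 9301 in the logs above) must also accept UDP on its own gossip port from the container, so a host firewall that drops that traffic would likely produce the same warning.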