NSQ Docker Swarm


I'm trying to use NSQ in Docker Swarm, without success.

mhlg/rpi-nsq is a Docker image built for the Raspberry Pi ARM7 board, and I can confirm that it works correctly when run as a normal Docker container.

Running NSQ in Docker (OK)

# create a bridged network
$ docker network create nsq_network

# run lookupd
$ docker run --name nsqlookupd --network nsq_network -p 4160:4160 -p 4161:4161 mhlg/rpi-nsq nsqlookupd

# run nsqd
$ docker run --name nsqd --network nsq_network -p 4150:4150 -p 4151:4151 mhlg/rpi-nsq nsqd --broadcast-address=nsqd --lookupd-tcp-address=nsqlookupd:4160

# run nsqadmin
$ docker run --name nsqadmin --network nsq_network -p 4171:4171  mhlg/rpi-nsq nsqadmin --lookupd-http-address=nsqlookupd:4161
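
One way to confirm that the standalone setup is healthy is to query the HTTP ports published above. The following is a minimal sketch, assuming `curl` is installed on the Docker host and the published ports are reachable on `localhost`. `nsq_smoke_test` is a hypothetical helper name, while `/ping` (nsqd) and `/nodes` (nsqlookupd) are NSQ HTTP endpoints.

```shell
# Hypothetical helper: check that nsqd is up and see what nsqlookupd knows about.
# Assumption: ports 4151 and 4161 are published on the given host (default: localhost).
nsq_smoke_test() {
  host="${1:-localhost}"
  # nsqd replies to GET /ping on its HTTP port (4151) when it is healthy.
  curl -fsS "http://${host}:4151/ping" || return 1
  # nsqlookupd lists the nsqd instances registered with it under GET /nodes (4161).
  curl -fsS "http://${host}:4161/nodes" || return 1
}
```

Calling `nsq_smoke_test` with no argument targets `localhost`; pass a hostname or IP address to check another machine.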

Running NSQ in Docker Swarm mode (FAIL)

This is what I'm doing on the swarm manager:

# create an overlay network
$ docker network create --driver overlay nsq_network

# run nsqlookupd
$ docker service create --replicas 1 --name nsqlookupd --network nsq_network -p 4160:4160 -p 4161:4161 mhlg/rpi-nsq nsqlookupd

# run nsqd
$ docker service create --replicas 1 --name nsqd --network nsq_network -p 4150:4150 -p 4151:4151 mhlg/rpi-nsq nsqd --lookupd-tcp-address=nsqlookupd:4160 --broadcast-address=nsqd

# run nsqadmin
$ docker service create --replicas 1 --name nsqadmin --network nsq_network -p 4171:4171  mhlg/rpi-nsq nsqadmin --lookupd-http-address=nsqlookupd:4161

If I attach to the nsqd service, I can see that it is not able to connect to the nsqlookupd service:

[nsqd] 2016/12/09 16:51:56.851953 LOOKUPD(nsqlookupd:4160): sending heartbeat
[nsqd] 2016/12/09 16:51:56.852049 LOOKUP connecting to nsqlookupd:4160
[nsqd] 2016/12/09 16:51:57.852457 LOOKUPD(nsqlookupd:4160): ERROR PING - dial tcp: i/o timeout

It looks like the overlay network is causing some issue (multicast?), but I cannot figure out how to solve it, especially on an ARM device.

I tried to ssh into the Docker host running the nsqd service and ran some DNS commands from inside the nsqd container:

# resolve google.com (OK)
root@3206d1c3cd3d:/# nslookup google.com
Server:     127.0.0.11
Address:    127.0.0.11#53

Non-authoritative answer:
Name:   google.com
Address: 216.58.214.78

# resolve nsqd service (OK) - can resolve the container I'm executing the command from
root@e1f6430acd1c:/# nslookup nsqd
Server:     127.0.0.11
Address:    127.0.0.11#53

Non-authoritative answer:
Name:   nsqd
Address: 10.0.0.2

# resolve nsqlookupd service (FAIL)
root@e1f6430acd1c:/# nslookup nsqlookupd
;; connection timed out; no servers could be reached
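
The same picture can be gathered from the manager side. The following is a minimal sketch, assuming it runs on a swarm manager node and uses the network and service names from the commands above; `nsq_swarm_diag` is a hypothetical helper name. A network that services can attach to should report `overlay swarm` as its driver and scope, and the task listing shows which node each service landed on, which may help narrow down whether only cross-node lookups fail.

```shell
# Hypothetical helper: print the driver/scope of a network and the task
# placement of each service attached to it.
# Assumption: run on a swarm manager (docker service ps needs manager access).
nsq_swarm_diag() {
  network="$1"
  shift
  # A swarm-scoped overlay network prints "overlay swarm" here.
  docker network inspect --format '{{.Driver}} {{.Scope}}' "$network"
  for service in "$@"; do
    # Show the node and current state of every task of the service.
    docker service ps "$service"
  done
}
```

For the setup in this question, the call would be `nsq_swarm_diag nsq_network nsqlookupd nsqd`.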

Solution

  • I ran into exactly the same issue in Docker Swarm. This is how I resolved it:

    docker service create \
    --mode global \
    --name swarm-master-nsq_nsqlookupd \
    --constraint node.role==manager \
    --hostname nsqlookupd \
    --network name=swarm-master-nsq_nsq,alias=nsqlookupd \
    nsqio/nsq:latest /nsqlookupd
    
    
    docker service create \
    --replicas 3 \
    --name swarm-master-nsq_nsqd \
    --constraint node.role==manager \
    --hostname nsqd \
    --network name=swarm-master-nsq_nsq,alias=nsqd \
    nsqio/nsq:latest sh -c '/nsqd --broadcast-address=$(hostname -i) --lookupd-tcp-address=nsqlookupd:4160'
    
    
    docker service create \
    --replicas 1 \
    --publish 4171:4171 \
    --name swarm-master-nsq_nsqadmin \
    --constraint node.role==manager \
    --hostname nsqadmin \
    --network name=swarm-master-nsq_nsq,alias=nsqadmin \
    nsqio/nsq:latest /nsqadmin --lookupd-http-address=nsqlookupd:4161
    

    As far as I can tell, there are a couple of issues in your example:

    1. You are not giving nsqlookupd (or the other services) a network alias.
    2. The broadcast address of nsqd is incorrect (assuming you want to increase the number of nsqd nodes at some point).
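
On the second point: with `--broadcast-address=nsqd`, every replica registers the same service name with nsqlookupd, and by default a swarm service name resolves to the service's virtual IP rather than to one particular task. The `sh -c` wrapper above avoids that by substituting the container's own address. The following is a minimal sketch of the same idea as a standalone function, assuming `hostname -i` may print more than one address and that the first one is reachable from the other NSQ services. Both are assumptions, and `nsqd_command` is a hypothetical helper name.

```shell
# Hypothetical helper: print the nsqd command line a single replica would run.
# Assumption: the first address printed by `hostname -i` is the one that the
# other NSQ services can reach.
nsqd_command() {
  lookupd="${1:-nsqlookupd:4160}"
  # Each replica advertises its own address instead of the shared service name.
  addr="$(hostname -i | awk '{print $1}')"
  printf '/nsqd --broadcast-address=%s --lookupd-tcp-address=%s\n' "$addr" "$lookupd"
}
```

In an entrypoint script the result could be run with `exec $(nsqd_command)`, which is roughly what the `sh -c '...'` form in the service definition does inline.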