Tags: docker, docker-compose, docker-swarm

Docker swarm replicas on different nodes


I'm using a docker-compose file (version 3) with its deploy key to run a swarm (Docker 1.13), and I would like to replicate a service in order to make it resilient against single-node failure.

However, when I'm adding a deploy section like this:

deploy:
    replicas: 2

in my four-node cluster, I sometimes end up with both replicas scheduled on the same node. What I'm missing is a constraint that schedules the two instances on different nodes.

I know that there's a global mode I could use but that would run an instance on every node, i.e. four instances in my case instead of just two.
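For reference, global mode is declared in the compose file like this (a sketch; the service and image names are placeholders):

```yaml
services:
  web:
    image: nginx:alpine
    deploy:
      mode: global   # schedules exactly one task on every node in the swarm
```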

Is there a simple way to specify this constraint generically, without having to resort to a combination of global mode and node labels to keep additional instances away from certain nodes?

Edit: After trying it again I find containers to be scheduled on different nodes this time around. I'm beginning to wonder if I may have had a 'node.hostname == X' constraint in place.

Edit 2: After another service update - and without any placement constraints - the service is again being scheduled on the same node (as displayed by ManoMarks Visualizer):



Solution

  • docker/cli PR 1612 resolves issue 26259 and was released in Docker 19.03:

Added a new --replicas-max-per-node switch to docker service

    How to verify it

Create two services and specify --replicas-max-per-node for one of them:

    docker service create --detach=true --name web1 --replicas 2 nginx
    docker service create --detach=true --name web2 --replicas 2 --replicas-max-per-node 1 nginx
    

See the difference in the command outputs:

    $ docker service ls
    ID                  NAME                MODE                REPLICAS               IMAGE               PORTS
    0inbv7q148nn        web1                replicated          2/2                    nginx:latest        
    9kry59rk4ecr        web2                replicated          1/2 (max 1 per node)   nginx:latest
    
    $ docker service ps --no-trunc web2
    ID                          NAME                IMAGE                                                                                  NODE                DESIRED STATE       CURRENT STATE            ERROR                                                     PORTS
    bf90bhy72o2ry2pj50xh24cfp   web2.1              nginx:latest@sha256:b543f6d0983fbc25b9874e22f4fe257a567111da96fd1d8f1b44315f1236398c   limint              Running             Running 34 seconds ago                                                             
    xedop9dwtilok0r56w4g7h5jm   web2.2              nginx:latest@sha256:b543f6d0983fbc25b9874e22f4fe257a567111da96fd1d8f1b44315f1236398c                       Running             Pending 35 seconds ago   "no suitable node (max replicas per node limit exceed)"   
    

    The error message would be:

    no suitable node (max replicas per node limit exceed)
    

    Examples from Sebastiaan van Stijn:

    Create a service with max 2 replicas:

    docker service create --replicas=2 --replicas-max-per-node=2 --name test nginx:alpine
    docker service inspect --format '{{.Spec.TaskTemplate.Placement.MaxReplicas}}' test
    2
    

Update the service (max replicas should keep its value):

    docker service update --replicas=1 test
    docker service inspect --format '{{.Spec.TaskTemplate.Placement.MaxReplicas}}' test
    2
    

    Update the max replicas to 1:

    docker service update --replicas-max-per-node=1 test
    docker service inspect --format '{{.Spec.TaskTemplate.Placement.MaxReplicas}}' test
    1
    

And reset to 0 (which removes the limit):

    docker service update --replicas-max-per-node=0 test
    docker service inspect --format '{{.Spec.TaskTemplate.Placement.MaxReplicas}}' test
    0
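Since the original question used a compose file: the same limit is exposed in compose file format 3.8 (Docker 19.03+) as max_replicas_per_node under deploy.placement. A minimal sketch, with an illustrative service and image:

```yaml
version: "3.8"
services:
  web:
    image: nginx:alpine
    deploy:
      replicas: 2
      placement:
        max_replicas_per_node: 1   # at most one replica of this service per node
```

With this in place, the second replica stays pending rather than doubling up on a node, just as with the --replicas-max-per-node flag above.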