Tags: docker, nginx, docker-swarm, high-availability

Nginx: retry the same endpoint on http_502 with Docker service discovery


We use Docker Swarm with service discovery for a backend REST application. The services in the swarm are configured with endpoint_mode: vip and run in global mode. Nginx proxies to them through their service discovery aliases. When we update the backend services, nginx sometimes returns 502, because service discovery may point to a service that is still updating.

In that case, we want to retry the same endpoint. How can we achieve this?

According to this, we added an upstream with the host's private IP and used proxy_next_upstream error timeout http_502; but the problem persists.

nginx.conf

upstream servers {
    server 192.168.1.2:443; #private ip of host machine
    server 192.168.1.2:443 backup;
}

server {
    listen 443 ssl http2 default_server;
    listen [::]:443 ssl http2 default_server;
    proxy_next_upstream http_502;
    location /endpoint1 {
        proxy_pass http://docker.service1:8080/endpoint1;
    }
    location /endpoint2 {
        proxy_pass http://docker.service2:8080/endpoint2;
    }
    location /endpoint3 {
        proxy_pass http://docker.service3:8080/endpoint3;
    }
}

Here, if http://docker.service1:8080/endpoint1 returns 502, we want to hit http://docker.service1:8080/endpoint1 again.

Additional queries:

  1. Is there any way in Docker Swarm to stop service discovery from pointing to an updating service until that service is fully up?
  2. Is the upstream necessary here, since we use Docker service discovery directly?

Solution

  • I suggest adding a health check directly at the container level (here).

    With a health check, Docker periodically probes an endpoint you specify. If the container is found unhealthy, Docker will 1) stop routing traffic to it and 2) kill it and start a new one. Your upstream will therefore resolve to one of the healthy containers, so there is no need to retry.

    As for your additional questions: first, Docker won't start routing traffic to a container until it is healthy. Second, nginx is still useful for distributing traffic by endpoint URL. Personally, though, I don't think nginx plus swarm VIP mode is a great choice: the swarm load balancer is poorly documented, it doesn't support sticky sessions, and you can't have a proxy-level health check. I would use Traefik instead, since it has its own load balancer.
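
As a rough illustration of the container-level health check suggested above, here is a minimal sketch of a stack file. The service name, image, the /health path, and the timing values are all placeholders and not taken from the question. It also assumes curl is available inside the image and that the swarm has more than one node, so that a healthy task exists elsewhere while one is being replaced.

```yaml
version: "3.8"

services:
  service1:                              # hypothetical service name
    image: example/backend:latest        # hypothetical image
    healthcheck:
      # Assumes the app exposes a health endpoint at /health on port 8080
      # and that curl is installed in the image.
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 10s        # how often the check runs
      timeout: 3s          # a check that takes longer counts as a failure
      retries: 3           # consecutive failures before "unhealthy"
      start_period: 30s    # failures during startup do not count
    deploy:
      mode: global         # one task per node, as in the question
      endpoint_mode: vip
      update_config:
        parallelism: 1     # update one task at a time
        delay: 10s         # wait between updating consecutive tasks
```

A file like this would typically be deployed with docker stack deploy -c <file> <stack-name>. The values above are only starting points; start_period in particular should be at least as long as the application usually takes to boot.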