Tags: amazon-web-services, docker, elasticsearch, amazon-ec2, docker-swarm

AWS EC2 issue with Docker Swarm using dnsrr to set up Elasticsearch cluster discovery


First of all, I don't think this issue is specific to Elasticsearch (I mention it so as not to discourage potential answers).

I'm using a Docker service with dnsrr (DNS round-robin) so that every node in the cluster can discover the others: they always look up the hostname 'elastic' and (should) get different IPs.

This works perfectly well when I create 3 VMs on my local machine. When I run it on 3 EC2 machines, however, the one configured as swarm leader only tries its own IP, while the two workers discover each other without issue, and I can't figure out why.

I'm fairly new to AWS, so I guess it must be some kind of misconfiguration somewhere, but I can't figure out what to check.

Thanks in advance if you have any idea as to what may cause this, and even better if you come up with a solution!

The Docker Compose file I use is below, simplified as much as possible to isolate the problem.

version: "3.3"

services:

  elastic:
    image: docker.elastic.co/elasticsearch/elasticsearch:5.5.2
    environment:
    - ES_JAVA_OPTS=-Xms1g -Xmx1g
    - discovery.zen.ping.unicast.hosts=elastic
    - discovery.zen.minimum_master_nodes=2
    volumes:
    - elastic_data:/usr/share/elasticsearch/data
    networks:
    - overnet
    logging:
      driver: "json-file"
      options: 
        max-size: "20m"
        max-file: "10"
    deploy:
      mode: global
      endpoint_mode: dnsrr

networks:
  overnet:
    driver: overlay
    driver_opts:
      encrypted: "true"

volumes:
    elastic_data:
      external: true

Solution

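  • Try re-creating the overlay network without encryption enabled to see if that works.

    Also make sure the three nodes share a security group that opens the ports Docker Swarm needs between them: TCP 2377 (cluster management), TCP and UDP 7946 (node-to-node communication), and UDP 4789 (overlay network traffic). An encrypted overlay network additionally needs IP protocol 50 (ESP) to be allowed.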
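
As a minimal sketch of the first suggestion, the networks section of the compose file from the question could look like this for a test deployment. It assumes the stack is removed and redeployed so that the overlay network is actually re-created, and it is meant as a diagnostic step rather than a final configuration.

```yaml
networks:
  overnet:
    driver: overlay
    # driver_opts with encrypted: "true" removed for this test only.
    # Without encryption, overlay traffic between nodes is plain VXLAN
    # (UDP 4789) and does not rely on IPsec ESP (IP protocol 50).
```

If the leader discovers the other nodes with this version, that would suggest (though not prove) that something between the EC2 instances is blocking the encrypted overlay traffic, and the security group rules would be the first thing to check.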