amazon-web-services docker amazon-ec2 mesos marathon

Mesos/Marathon on EC2: agent hostnames not accessible


I am setting up a Mesos/Marathon cluster on Amazon EC2 with one master node and two agents. The installation succeeded, and when I look at :mesos-port the agents are listed correctly.

Each host is registered under its private DNS name (ip-17*-**-*-***.ec2.internal).

When I try to launch a Docker image (tutum/hello-world) through the Marathon web UI, the deployment fails.

In the Mesos UI, the completed tasks list shows the failed deployment attempts. The Sandbox link for such a task reports:

Failed to connect to agent '12136c28-93e7-4642-a5b6-c5e9a55eedd1-S0' on 'ip-17*-**-*-***.ec2.internal:5051'.
Potential reasons:
The agent's hostname, 'ip-17*-**-*-***.ec2.internal', is not accessible from your network
The agent's port, '5051', is not accessible from your network
The agent timed out or went offline

I opened the port range completely in the security group, and I can ping the agents from the master.

I also added the private IP to the /etc/hosts file to be safe, but that did not help either.
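The error message lists hostname resolution and port access as separate causes, and ping only exercises ICMP, not TCP port 5051. A minimal Python sketch that tests both from whichever machine it runs on is shown here. The function name check_agent_reachable and the 3-second timeout are illustrative choices, not part of Mesos. Since the message says "from your network", it may be worth running it on the machine where the browser runs as well as on the master.

```python
import socket


def check_agent_reachable(hostname, port=5051, timeout=3.0):
    """Return (resolved_ip, reachable) for a TCP endpoint.

    resolved_ip is None when this machine cannot resolve the hostname.
    reachable is True only if a TCP connection to the port succeeds.
    """
    # Step 1: can this machine resolve the name at all?
    try:
        ip = socket.gethostbyname(hostname)
    except OSError:
        return None, False

    # Step 2: does the port accept TCP connections within the timeout?
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return ip, True
    except OSError:
        return ip, False
```

Calling check_agent_reachable with the agent's registered hostname returns (None, False) when the name does not resolve, and (ip, False) when the name resolves but the port is blocked or nothing is listening on it.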

Any ideas?


Solution

  • I did this a long time ago, so I do not remember the paths exactly.

    On each slave (agent), go to the /etc/mesos-slave folder (create it if missing) and create two files as follows:

    1) A file named containerizers with mesos,docker in it.

    2) A file named executor_registration_timeout with 5mins in it.
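The two files above can be created with a short script. This is only a sketch under stated assumptions: write_agent_flags is a hypothetical helper name, and the second file name, executor_registration_timeout, assumes that this is the flag the answer has in mind (it is the name used in the Marathon native Docker documentation referenced in this answer). It also assumes a package install that reads one file per flag from /etc/mesos-slave; writing there normally needs root.

```python
from pathlib import Path


def write_agent_flags(config_dir="/etc/mesos-slave",
                      containerizers="mesos,docker",
                      timeout="5mins"):
    """Write one file per agent flag; each file name is the flag name."""
    directory = Path(config_dir)
    # Create the folder if it is missing, as the answer suggests.
    directory.mkdir(parents=True, exist_ok=True)

    flags = {
        "containerizers": containerizers,
        # Assumed file name; see the note above the code.
        "executor_registration_timeout": timeout,
    }
    for name, value in flags.items():
        (directory / name).write_text(value + "\n")

    # Return the names of the files written, for logging or checks.
    return sorted(flags)
```

Passing a scratch directory as config_dir makes it possible to inspect the generated files before touching the real configuration folder.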

    See:
    https://mesosphere.github.io/marathon/docs/native-docker.html
    https://mesosphere.github.io/marathon/docs/troubleshooting.html

    Now restart your master and slaves.

    Also, you need to open the ports in your security groups. For testing you can allow All Traffic, but that is not recommended beyond testing.

    Done!