I am setting up a Mesos/Marathon cluster on Amazon EC2 with one master node and two agents. The installation succeeded, and the Mesos UI at :mesos-port lists the agents correctly.
Each host is registered under its private DNS name (ip-17*---.ec2.internal).
When I try to launch a Docker image (tutum/hello-world) through the Marathon web UI, the deployment fails.
In the Mesos UI, the completed tasks list shows the failed deployment attempts. Following the Sandbox link gives this message:
Failed to connect to agent '12136c28-93e7-4642-a5b6-c5e9a55eedd1-S0' on 'ip-17*-**-*-***.ec2.internal:5051'.
Potential reasons:
The agent's hostname, 'ip-17*-**-*-***.ec2.internal', is not accessible from your network
The agent's port, '5051', is not accessible from your network
The agent timed out or went offline
I opened the full port range in the security group, and I can ping the agents from the master.
I also added the private IP to the /etc/hosts file to be safe, but that did not help either.
Any ideas?
I did this a long time ago, so I do not remember the paths exactly.
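On each slave, go to the /etc/mesos-slave folder (create it if missing) and create two files as follows:
1) A file named containerizers containing "mesos,docker".
2) A file named executor_registration_timeout containing "5mins" (this is the name used in the Marathon native Docker docs linked below).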
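The two files above can be created with a small script. This is only a sketch: the function name write_agent_flags and the AGENT_FLAGS dictionary are made up for illustration, and it assumes the timeout file is named executor_registration_timeout, as in the Marathon native Docker documentation, and that your Mesos package reads one flag per file from /etc/mesos-slave.

```python
from pathlib import Path

# Assumption: each file name is an agent flag and the file content is its value.
AGENT_FLAGS = {
    "containerizers": "mesos,docker",
    "executor_registration_timeout": "5mins",
}


def write_agent_flags(config_dir, flags=AGENT_FLAGS):
    """Create config_dir if needed and write one file per flag; return the paths."""
    directory = Path(config_dir)
    directory.mkdir(parents=True, exist_ok=True)
    written = []
    for name, value in flags.items():
        path = directory / name
        # A trailing newline mirrors what `echo value > file` would produce.
        path.write_text(value + "\n")
        written.append(path)
    return written
```

Calling write_agent_flags("/etc/mesos-slave") as root on each slave should leave both files in place. Pass a scratch directory first if you want to inspect the result before touching /etc.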
Refer to:
https://mesosphere.github.io/marathon/docs/native-docker.html
https://mesosphere.github.io/marathon/docs/troubleshooting.html
Now restart your master and slaves.
Also, you need to open all the ports in your security groups. For testing you can allow All Traffic (not recommended).
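To confirm the ports really are reachable, note that ping only exercises ICMP, so a successful ping does not show that TCP port 5051 is open. As far as I recall, the Sandbox view makes your browser contact the agent directly, so the check is most useful when run from the machine where the browser runs. Below is a minimal sketch; check_agent is a hypothetical helper name, and the hostname you pass in is whatever the error message reports for your agent.

```python
import socket


def check_agent(host, port=5051, timeout=3.0):
    """Return (resolves, port_open) for host:port as seen from this machine."""
    try:
        # Step 1: can this machine resolve the agent's hostname at all?
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return (False, False)
    try:
        # Step 2: can a TCP connection to the agent port be opened?
        with socket.create_connection((host, port), timeout=timeout):
            return (True, True)
    except OSError:
        # Refused, filtered, or timed out.
        return (True, False)
```

A result of (False, False) points at name resolution (the private EC2 name is not known outside the VPC), while (True, False) points at the security group or a firewall blocking port 5051.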
Done!