I'm attempting the (potentially foolish) task of Dockerizing Zookeeper/Marathon/Mesos and deploying Docker containers from the Dockerized Mesos cluster.
So far, I have a working Mesos cluster on two physically separate nodes: one node is running both a Mesos master and a slave (container Dockerfiles linked), and the second node is running just a slave. They seem to be functioning just fine; I am able to submit very simple jobs through Marathon (also its own container, running on the node with the master and slave) and they complete successfully.
However, when I attempt to submit Docker containers through the Marathon API, it seems to hang. The Marathon interface hangs at "Deploying" and never changes, even after letting it sit for 15 minutes, stopping, resubmitting, and letting it sit for another 15 minutes.
At the same time, tasks are nonetheless being submitted to the Mesos slaves; the Mesos UI is reporting FAILED tasks left and right.
EDIT 1
The resulting Sandbox logs for each of the executors are also completely empty.
EDIT 2
Found something interesting buried in the slave logs:
Line of interest:
None of the enabled containerizers (mesos) could create a container for the provided TaskInfo/ExecutorInfo message.
It looks like the containerizer is failing to run, and from what I can see, it's not even considering docker as a containerizer. I followed the configuration here to deploy Docker jobs; does this change if the Mesos slaves are themselves Docker containers?
I'm somewhat out of my element and can't find any references along these lines. Any idea what's happening?
What's your docker run command for the slave?
Here are a few parameters others have found useful:
--net host \
--pid host \
--privileged \
--env MESOS_CONTAINERIZERS=docker,mesos \
--env MESOS_EXECUTOR_REGISTRATION_TIMEOUT=5mins \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /sys:/sys:ro \
-v /usr/bin/docker:/usr/bin/docker:ro \
-v /lib64/libdevmapper.so.1.02:/lib/libdevmapper.so.1.02:ro \
-v /home/core/.dockercfg:/root/.dockercfg:ro \
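For context, here is a rough sketch of how those flags might fit into a complete command. The image name and container name are placeholders I made up, and I'm assuming the image's entrypoint starts the slave and that the master address is configured separately (for example through a MESOS_MASTER environment variable). The libdevmapper and .dockercfg mounts are left out because their paths depend on the host. The script only prints the command so you can review it before running anything.

```shell
#!/bin/sh
# Placeholder values (assumptions, not from the original setup).
IMAGE="${IMAGE:-example/mesos-slave}"   # hypothetical image name
NAME="${NAME:-agent-1}"                 # avoid names starting with "mesos-"

# Assemble the command as a string; the backslash-newline pairs inside the
# double quotes are line continuations and do not end up in the value.
slave_cmd="docker run -d --name $NAME \
--net host --pid host --privileged \
--env MESOS_CONTAINERIZERS=docker,mesos \
--env MESOS_EXECUTOR_REGISTRATION_TIMEOUT=5mins \
-v /var/run/docker.sock:/var/run/docker.sock \
-v /sys:/sys:ro \
-v /usr/bin/docker:/usr/bin/docker:ro \
$IMAGE"

# Print only; copy and run the command once it matches your environment.
printf '%s\n' "$slave_cmd"
```

The two environment variables are the important part for the error in the question: without docker listed in MESOS_CONTAINERIZERS, only the mesos containerizer is enabled.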
Also note that you shouldn't name the container mesos-slave, as the slave will try to remove any containers prefixed with mesos- upon recovery.
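If you script your deployment, a small guard like the following could catch a bad name before the container is started. This is only an illustration; name_is_safe is a helper name I invented, and it checks nothing beyond the mesos- prefix.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (returns 0) only if the container name
# does not start with "mesos-", the prefix the slave cleans up on recovery.
name_is_safe() {
  case "$1" in
    mesos-*) return 1 ;;
    *)       return 0 ;;
  esac
}
```

For example, `name_is_safe "$NAME" || echo "pick a different container name" >&2` would warn before you call docker run.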
FYI, Mesos uses the docker --version command to see if the docker containerizer can be used. Try launching a Marathon task that just runs docker --version to see if that would work inside your dockerized slave's environment.
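One way to submit such a check is through Marathon's /v2/apps endpoint. The sketch below is an assumption-laden example: the Marathon URL, the app id, and the resource numbers are placeholders, and submit_check is a helper name I made up. The app has no container section on purpose, so the command runs directly in the slave's environment rather than in a new Docker container.

```shell
#!/bin/sh
# Placeholder Marathon address (assumption); point this at your Marathon.
MARATHON_URL="${MARATHON_URL:-http://localhost:8080}"

# Command-only app definition. The sleep keeps the task around briefly so
# Marathon does not restart it immediately after docker --version exits.
app_json='{
  "id": "/docker-version-check",
  "cmd": "docker --version && sleep 60",
  "cpus": 0.1,
  "mem": 32,
  "instances": 1
}'

# Hypothetical helper: POST the app definition to Marathon.
# It is defined here but not called, so sourcing this file sends nothing.
submit_check() {
  printf '%s' "$app_json" |
    curl -sS -X POST -H "Content-Type: application/json" \
      --data-binary @- "$MARATHON_URL/v2/apps"
}
```

Call submit_check once the URL is right, then look at the task's stdout and stderr in the Mesos sandbox. If docker --version fails there, the docker binary or one of its libraries is probably not visible inside the slave container.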