
Docker on Mesos: Volume is placed on which node?


I will be setting up a Mesos cluster to run single-use Docker jobs, e.g. long RapidMiner computations. Of course I want to get the result of the computation, so I think I should use Docker volumes for that.

Now, when I send a docker job to a cluster, specifying the volume for example in a JSON job file for Marathon or Chronos, where does the result of my computation land?

I am guessing that it is put into the respective directory on the slave node, but do I really have to go into the Mesos interface, look up which node executed my job, SSH into that node and copy my resulting file out? This seems to run counter to the whole idea of Mesos, which is to abstract away from individual machines.

What would be an elegant solution for this scenario? I am very new to cluster management, so the only good solution I could think of was a distributed filesystem, although I don't know if this would be supported in the job file of Marathon or Chronos.


Solution

  • It is safe to say that Mesos assumes all your final data is stored somewhere by the time your task finishes, and that it is your responsibility (or your task's, or your framework's) to ensure this. If you want to persist intermediate results or share results between tasks, you can look at persistent volumes, which are currently under development and will hopefully land in the next Mesos release. Be advised that they are considered part of a node's resources and are not replicated, so they will be lost if the node fails.

    As an alternative to a distributed file system, you can modify your task so that it sends the result of the computation to some external storage, e.g. a database or an FTP server.
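
    To illustrate the last suggestion, here is a minimal sketch of a final step a task could run before it exits. It assumes an HTTP storage service that accepts PUT requests, which is only one possible choice of storage; the function name upload_result and the example URL are hypothetical and not part of Mesos, Marathon or Chronos.

```python
import urllib.request


def upload_result(result_path: str, url: str, timeout: float = 30.0) -> int:
    """Send the file at result_path to url with HTTP PUT.

    Returns the HTTP status code of the response. Both the storage
    service and its URL layout are assumptions made for this sketch.
    """
    # Read the finished result from the container's local filesystem.
    with open(result_path, "rb") as fh:
        payload = fh.read()

    request = urllib.request.Request(
        url,
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/octet-stream"},
    )

    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses, so a
    # failed upload makes the task exit with an error instead of
    # silently losing the result.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

    The task would call something like upload_result("/data/result.csv", "http://storage.example.com/results/job-1.csv") as its last step, so the result no longer depends on which node ran the job.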