Tags: java, mesos, ignite, marathon, dcos

Apache Ignite on DC/OS Marathon (or any other Java app)


I've been trying to configure Apache Ignite on DC/OS (1.8.7) Marathon using the official docs at http://apacheignite.gridgain.org/docs/mesos-deployment, but short of some hacks I haven't been able to get it to work by following them. One of the core reasons appears to be that the cmd

"cmd": "java -jar ignite-mesos-1.8.0.jar"

will throw an error "sh: java: command not found". This would indicate that java is not on the PATH, but I've verified on the Marathon hosts that java is in fact accessible on the PATH, at least for my regular user.

I suspect that java somehow needs to be added to the PATH of the Mesos container that runs the cmd, but I've been unable to find any documentation on how to set the PATH or default environment variables in the containers that get created. (ignite-mesos also spawns tasks that need JAVA_HOME set, and it is missing in those tasks as well.) For reference, my marathon.json file is below:

{
  "id": "/ignition",
  "cmd": "java -jar ignite-mesos-1.8.0.jar",
  "args": null,
  "user": null,
  "env": {
    "IGNITE_MEMORY_PER_NODE": "2048",
    "IGNITE_NODE_COUNT": "3",
    "IGNITE_VERSION": "1.8.0",
    "MESOS_MASTER_URL": "zk://master.mesos:2181/mesos",
    "IGNITE_RUN_CPU_PER_NODE": "0.1"
  },
  "instances": 0,
  "cpus": 0.25,
  "mem": 2048,
  "disk": 0,
  "gpus": 0,
  "executor": null,
  "constraints": null,
  "fetch": [
    {
      "uri": "http://SERVER_HERE/ignite-mesos-1.8.0.jar"
    }
  ],
  "storeUrls": null,
  "backoffSeconds": 1,
  "backoffFactor": 1.15,
  "maxLaunchDelaySeconds": 3600,
  "container": null,
  "healthChecks": null,
  "readinessChecks": null,
  "dependencies": null,
  "upgradeStrategy": {
    "minimumHealthCapacity": 1,
    "maximumOverCapacity": 1
  },
  "labels": {
    "HAPROXY_GROUP": "external"
  },
  "acceptedResourceRoles": null,
  "ipAddress": null,
  "residency": null,
  "secrets": null,
  "taskKillGracePeriodSeconds": null,
  "portDefinitions": [
    {
      "protocol": "tcp",
      "port": 10108
    }
  ],
  "requirePorts": false
}

Solution

  • Ignite seems to expect a JDK 1.7/1.8 installation on each agent node, with the JAVA_HOME environment variable set accordingly.

    Unfortunately, Ignite's Mesos framework (ignite-mesos) doesn't seem to be well maintained, as it still uses Mesos 0.22 libraries.
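
As a rough sketch of how that could look in the app definition from the question: Marathon passes the entries under "env" into the task's environment and runs "cmd" through a shell, so JAVA_HOME can be set there and java invoked by its full path instead of relying on the PATH. /PATH/TO/JDK_HERE is a placeholder for wherever the JDK is actually installed on your agents (an assumption, not a value from the docs). Only the changed fields are shown; the rest of marathon.json stays as it was.

```json
{
  "id": "/ignition",
  "cmd": "$JAVA_HOME/bin/java -jar ignite-mesos-1.8.0.jar",
  "env": {
    "JAVA_HOME": "/PATH/TO/JDK_HERE",
    "IGNITE_MEMORY_PER_NODE": "2048",
    "IGNITE_NODE_COUNT": "3",
    "IGNITE_VERSION": "1.8.0",
    "MESOS_MASTER_URL": "zk://master.mesos:2181/mesos",
    "IGNITE_RUN_CPU_PER_NODE": "0.1"
  }
}
```

Note that this only covers the scheduler process that Marathon launches. I haven't verified whether ignite-mesos forwards JAVA_HOME to the node tasks it spawns. If those tasks still lack it, the agent-side environment is the other place to look, for example the Mesos agent's --executor_environment_variables flag.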