I have a DC/OS cluster with 3 agent nodes, and a few services such as Spark running on DC/OS.
- If I scale my DC/OS cluster, do I need to scale Spark as well? (If I add a 4th node to the DC/OS cluster and then run a Spark job, the master may allocate resources for the job on the 4th node, where Spark is not installed, and the job will fail.)
From what I have observed, jobs are submitted to any node that the Mesos master can see.
- Is there a way to specify that a Spark job should not run on certain nodes?
Dynamic allocation may help, but I've not used it:
http://spark.apache.org/docs/latest/running-on-mesos.html#dynamic-resource-allocation-with-mesos
http://spark.apache.org/docs/latest/job-scheduling.html#dynamic-resource-allocation
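As a rough sketch (I have not verified this on DC/OS), dynamic allocation is switched on through Spark configuration properties passed at submit time; per the Spark-on-Mesos docs it also needs the external shuffle service running on each agent. The class and jar below are placeholders:
# Hypothetical submission with dynamic allocation enabled;
# requires the Mesos external shuffle service on every agent (see links above).
$ dcos spark run --submit-args="--conf spark.dynamicAllocation.enabled=true --conf spark.shuffle.service.enabled=true --class <your-main-class> <your-application-jar>"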
To install multiple instances of the DC/OS Spark package, set service.name to a unique name (e.g. "spark-dev") in your JSON options file during installation:
{
  "service": {
    "name": "spark-dev"
  }
}
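You then pass that options file when installing the package. Assuming it is saved as spark-dev.json (a filename chosen here for illustration), the install would look roughly like:
# Install an additional Spark instance using the options file above
# (spark-dev.json is an assumed filename).
$ dcos package install spark --options=spark-dev.json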
To use a specific Spark instance from the DC/OS Spark CLI:
$ dcos config set spark.app_id <service.name>
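For example, to target the "spark-dev" instance from above and submit a job to it (class and jar are placeholders):
# Point the CLI at the "spark-dev" instance, then submit a job to it.
$ dcos config set spark.app_id spark-dev
$ dcos spark run --submit-args="--class <your-main-class> <your-application-jar>"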
https://docs.mesosphere.com/1.8/usage/service-guides/spark/install/