
Spark Streaming through a Kafka receiver on a coarse-grained Mesos cluster


I have been prototyping Spark Streaming 1.6.1 with a Kafka receiver on a Mesos 0.28 cluster running in coarse-grained mode.

I have 6 Mesos slaves, each with 64 GB RAM and 16 cores.
My Kafka topic has 3 partitions.
My goal is to launch 3 executors in total (each on a different Mesos slave), with each executor running one Kafka receiver that reads from one Kafka partition.

When I launch my Spark application with spark.cores.max set to 24 and spark.executor.memory set to 8GB, I get two executors: one with 16 cores on one slave and one with 8 cores on another slave.
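
The 16 + 8 split above matches a simple greedy model of coarse-grained mode without a per-executor core setting: at most one executor per slave, each taking every offered core until spark.cores.max is reached. The sketch below is only an illustration of that model, not Spark's scheduler code; the function name and the assumption that each slave offers all 16 of its cores are hypothetical.

```python
def allocate_one_executor_per_agent(offered_cores, cores_max):
    """Greedy model: one executor per agent, each taking all offered
    cores until the application-wide cap (spark.cores.max) is reached."""
    acquired = 0
    executors = []
    for cores in offered_cores:
        if acquired >= cores_max:
            break  # cap reached, remaining offers are not used
        take = min(cores, cores_max - acquired)
        executors.append(take)
        acquired += take
    return executors


# Assumption: six slaves, each offering all 16 cores; spark.cores.max = 24.
print(allocate_one_executor_per_agent([16] * 6, 24))  # [16, 8]
```

Under this model the third slave is never used, because the cap of 24 cores is already consumed by the first two offers.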

I am looking to get 3 executors with 8 cores each on three different slaves. Is that possible with Mesos through resource reservation / isolation, constraints, etc.?

The only workaround that works for me right now is to scale down each Mesos slave node to have at most 8 cores. I don't want to use Mesos in fine-grained mode for performance reasons, and its support is going away soon.


Solution

  • Mesosphere has contributed the following patch to Spark: https://github.com/apache/spark/commit/80cb963ad963e26c3a7f8388bdd4ffd5e99aad1a. This improvement will land in Spark 2.0. Mesosphere has backported this and other improvements to Spark 1.6.1 and made them available in DC/OS (http://dcos.io).

    This patch introduces a new "spark.executor.cores" config variable in coarse-grained mode. When the "spark.executor.cores" config variable is set, executors will be sized with the specified number of cores.

    If an offer arrives with enough resources for a multiple of (spark.executor.memory, spark.executor.cores), multiple executors will be launched on that offer. This means there could be multiple, but separate, Spark executors running on the same Mesos agent node.

    There is no way (currently) to spread the executors across N Mesos agents. We briefly discussed adding the ability to spread Spark executors across N Mesos agents but concluded it doesn't buy much in terms of improved availability.

    Can you help us understand your motivations for spreading Spark executors across 3 Mesos agents? It's likely we haven't considered all possible use cases and advantages.

    Keith
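
To make the sizing rule in the answer concrete, here is a rough sketch of how many fixed-size executors could fit into a single offer once spark.executor.cores is set. It is a simplified model under stated assumptions, not the patch's actual logic: the function name is hypothetical, memory is given in whole GB, per-executor memory overhead is ignored, and how executors are placed across several offers is not modeled.

```python
def executors_per_offer(offer_cores, offer_mem_gb,
                        executor_cores, executor_mem_gb,
                        cores_remaining):
    """Number of (executor_cores, executor_mem_gb) executors that fit in
    one offer, also bounded by the cores still allowed under spark.cores.max."""
    by_cores = offer_cores // executor_cores
    by_mem = offer_mem_gb // executor_mem_gb
    by_cap = cores_remaining // executor_cores
    return max(0, min(by_cores, by_mem, by_cap))


# Numbers from the question, with spark.executor.cores = 8 as an assumed setting:
# a 16-core / 64 GB offer, 8 GB executors, 24 cores still available.
print(executors_per_offer(16, 64, 8, 8, 24))  # 2
```

With these numbers a single 16-core offer can hold two 8-core executors, which is why setting spark.executor.cores alone does not guarantee three executors on three different agents.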