Tags: amazon-web-services, apache-spark, emr

How to make the slave nodes do work in a Spark cluster on EMR?


I tried to run a job on my Spark cluster on EMR. The cluster has one master and two slaves, and each node (master or slave) has 32 cores. The job was submitted via "Add Step" in the console, with the configuration set as below:

sparkConf.setMaster("local[24]")
    .set("spark.executor.memory", "40g")
    .set("spark.driver.memory", "40g");

Then I noticed that the two slaves were idle (CPU usage close to 0) while only the master was working hard. How do I fix this and make the slaves do work?

Thanks!


Solution

  • When you specify a 'local' master, Spark runs the entire job in a single JVM on the machine that launched it - nothing is distributed over the cluster nodes. With local[24], all 24 worker threads run on the master node, which is why only the master shows CPU load.

    You should follow the doc: http://spark.apache.org/docs/1.2.0/spark-standalone.html

    Note that on EMR, Spark typically runs on YARN, so rather than hardcoding a master in the code, leave it unset and let spark-submit (which the EMR "Add Step" form wraps) supply --master yarn.
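    As a minimal sketch of the fix (the app name and job logic are illustrative placeholders, and the memory settings are copied from the question - tune them for your instance type):

    ```java
    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;

    public class ClusterJob {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                .setAppName("my-emr-job")  // illustrative name
                // Do NOT call setMaster("local[...]") here: "local" pins the
                // whole job to one JVM on the submitting node. On EMR the
                // master is passed in by spark-submit (--master yarn), so
                // leave it unset in code.
                .set("spark.executor.memory", "40g")
                .set("spark.driver.memory", "40g");
            JavaSparkContext sc = new JavaSparkContext(conf);
            // ... distributed job logic runs on the executors here ...
            sc.stop();
        }
    }
    ```

    You would then submit it with something like `spark-submit --master yarn --deploy-mode cluster --class ClusterJob your-job.jar` (exact arguments depend on your EMR release); with YARN as the master, executors are launched on the slave nodes and their CPUs will show load.
    
    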