apache-spark, pyspark, spark-submit

Spark: How to set spark.yarn.executor.memoryOverhead property in spark-submit


In Spark 2.0, how do you set the spark.yarn.executor.memoryOverhead property when you run spark-submit?

I know that for properties like spark.executor.cores you can pass --executor-cores 2. Is it the same pattern for this property, e.g. --yarn-executor-memoryOverhead 4096?


Solution

  • Please see the example below. The values can also be set in SparkConf.

    Example:

    ./bin/spark-submit \
    --class [your class] \
    --master yarn \
    --deploy-mode cluster \
    --num-executors 17 \
    --conf spark.yarn.executor.memoryOverhead=4096 \
    --executor-memory 35G \
    --conf spark.yarn.driver.memoryOverhead=4096 \
    --driver-memory 35G \
    --executor-cores 5 \
    --driver-cores 5 \
    --conf spark.default.parallelism=170 \
    /path/to/examples.jar

    Here --executor-memory is the amount of memory to use per executor process, --driver-memory is the amount of memory for the driver process, and --driver-cores is the number of cores to use for the driver process.
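As mentioned above, the same values can also be set programmatically through SparkConf before the context is created. A minimal PySpark sketch (the app name and the specific memory values here are illustrative, not from the original):

```python
from pyspark import SparkConf, SparkContext

# Build a configuration equivalent to the --conf flags in the
# spark-submit example above. Note: spark.yarn.* overhead values
# must be set before the SparkContext is created; they cannot be
# changed on a running application.
conf = (
    SparkConf()
    .setAppName("memory-overhead-example")  # illustrative name
    .set("spark.yarn.executor.memoryOverhead", "4096")
    .set("spark.yarn.driver.memoryOverhead", "4096")
    .set("spark.executor.memory", "35g")
    .set("spark.driver.memory", "35g")
    .set("spark.default.parallelism", "170")
)

sc = SparkContext(conf=conf)
```

Note that for a YARN cluster deployment, driver-side settings such as spark.yarn.driver.memoryOverhead still need to be supplied on the spark-submit command line, since the driver JVM is already running by the time this Python code executes.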