Google Dataproc single-node cluster, VCores Total = 8. As user spark, I tried:
/usr/lib/spark/sbin/start-thriftserver.sh --num-executors 2 --executor-cores 4
I also tried changing /usr/lib/spark/conf/spark-defaults.conf.
I also tried executing
export SPARK_WORKER_INSTANCES=6
export SPARK_WORKER_CORES=8
before start-thriftserver.sh
No success. In the YARN UI I can see that the Thrift server app uses only 2 cores, with 6 cores still available.
UPDATE1: Environment tab in the Spark UI:
spark.submit.deployMode client
spark.master yarn
spark.dynamicAllocation.minExecutors 6
spark.dynamicAllocation.maxExecutors 10000
spark.executor.cores 4
spark.executor.instances 1
It depends on which YARN deploy mode the app is running in.
In yarn-client mode:
- 1 core is used by the Application Master (the driver itself runs on the machine where you ran start-thriftserver.sh).
In yarn-cluster mode:
- The driver runs inside the AM container, so you can tune its cores with spark.driver.cores. The remaining cores are used by executors (1 core per executor by default).
Beware that --num-executors 2 --executor-cores 4 won't fit: it requests 2 × 4 = 8 cores, which is everything you have, and 1 more core is needed for the AM container (9 in total).
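The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes the AM container takes 1 vcore (Spark's default for a YARN application master). It also ignores memory, which YARN takes into account as well.

```python
def cores_requested(num_executors, executor_cores, am_cores=1):
    """Total vcores the app asks YARN for: all executors plus the AM container."""
    return num_executors * executor_cores + am_cores


def fits(num_executors, executor_cores, total_vcores, am_cores=1):
    """True if the whole request fits within the cluster's vcores."""
    return cores_requested(num_executors, executor_cores, am_cores) <= total_vcores


TOTAL_VCORES = 8  # "VCores Total" from the question

# 2 executors x 4 cores + 1 AM core = 9, which does not fit in 8
print(cores_requested(2, 4), fits(2, 4, TOTAL_VCORES))

# 2 executors x 3 cores + 1 AM core = 7, which fits in 8
print(cores_requested(2, 3), fits(2, 3, TOTAL_VCORES))
```

When a request does not fit, YARN only grants the containers it has room for, which would be one way to end up with fewer executors than requested.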
You can check core usage in the Spark UI, e.g. http://sparkhistoryserverip:18080/history/application_1534847473069_0001/executors/
The options below apply only to Spark standalone mode, so they have no effect on YARN:
export SPARK_WORKER_INSTANCES=6
export SPARK_WORKER_CORES=8
Please review all the available settings in the Spark Configuration documentation (latest).
In your case you can edit spark-defaults.conf and add the following (2 executors × 3 cores + 1 core for the AM = 7, which fits in 8):
spark.executor.cores 3
spark.executor.instances 2
Or use local[8] mode, since you have only one node anyway.