How can I configure the number of executors from Java (or Scala) code, given a SparkConf and a SparkContext? I constantly see 2 executors. It looks like spark.default.parallelism does not help and is about something different.
I just need to set the number of executors to be equal to the cluster size, but there are always only 2 of them. I know my cluster size. I run on YARN, if that matters.
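For context, this is roughly my setup, as a minimal sketch (app name, master and the parallelism value are just illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("my-job")          // illustrative name
  .setMaster("yarn")
  // controls the default number of partitions, apparently not the number of executors
  .set("spark.default.parallelism", "16")

val sc = new SparkContext(conf)
// ...and still only 2 executors show up
```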
OK, got it.
The number of executors is not really a Spark property itself, but rather something the driver uses to place the job on YARN. Since I'm using the SparkSubmit class as the driver, it has the --num-executors
parameter, which is exactly what I need.
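A minimal sketch of what I mean, calling SparkSubmit programmatically (class name, executor count and jar path are placeholders to be replaced with your own values):

```scala
import org.apache.spark.deploy.SparkSubmit

object SubmitWithExecutors {
  def main(args: Array[String]): Unit = {
    SparkSubmit.main(Array(
      "--master", "yarn",
      "--num-executors", "8",          // set this to your cluster size
      "--class", "com.example.MyJob",  // hypothetical application class
      "/path/to/my-job.jar"            // hypothetical application jar
    ))
  }
}
```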
UPDATE:
For some jobs I no longer follow the SparkSubmit
approach. I can't use it primarily for applications where the Spark job is only one component of the application (and is even optional). For these cases I use a spark-defaults.conf
attached to the cluster configuration with the spark.executor.instances
property inside it. This approach is much more universal and lets me balance resources properly depending on the cluster (developer workstation, staging, production).
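For reference, the relevant line in spark-defaults.conf looks like this (the value is just an example and differs per environment, e.g. smaller on a developer workstation, larger on production):

```
spark.executor.instances   4
```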