I have a Spark job that runs on a cluster with dynamic resource allocation enabled. I submit the job with the --num-executors and --executor-memory properties. Which takes precedence here? Will the job run with dynamic allocation, or with the resources I specify in the config?
It depends on which config parameter has the greater value: spark.dynamicAllocation.initialExecutors or spark.executor.instances, aka --num-executors when launching via spark-submit from the terminal.
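For concreteness, here is a sketch of how both can end up set on the same submit (the jar name and values are placeholders, not from the question):

    # Both an explicit executor count and dynamic allocation are in play here:
    # --num-executors sets spark.executor.instances, while the --conf flags
    # drive dynamic allocation.
    spark-submit \
      --master yarn \
      --num-executors 4 \
      --executor-memory 4g \
      --conf spark.dynamicAllocation.enabled=true \
      --conf spark.dynamicAllocation.initialExecutors=8 \
      my-app.jar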
If you are using Cloudera on YARN, check Cloudera's dynamic allocation reference doc, and make sure you are looking at the correct CDH version for your environment.
The Apache Spark documentation covers this too:
https://spark.apache.org/docs/latest/configuration.html#dynamic-allocation
So to sum it up: if you pass --num-executors explicitly, it will most likely override (i.e., disable) dynamic allocation, unless you set spark.dynamicAllocation.initialExecutors to a higher value. (In recent Spark versions, the initial executor target is the max of spark.executor.instances, spark.dynamicAllocation.initialExecutors, and spark.dynamicAllocation.minExecutors; some older versions and distributions instead disable dynamic allocation entirely when --num-executors is set explicitly.)
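If you want dynamic allocation to stay in control, one sketch is to keep initialExecutors (and the min/max bounds) above the explicit count; values here are illustrative, and note that dynamic allocation on YARN typically also requires the external shuffle service:

    # initialExecutors > --num-executors, so the dynamic-allocation settings
    # win the max() comparison and executors can still scale between min and max.
    spark-submit \
      --master yarn \
      --conf spark.dynamicAllocation.enabled=true \
      --conf spark.shuffle.service.enabled=true \
      --conf spark.dynamicAllocation.minExecutors=2 \
      --conf spark.dynamicAllocation.initialExecutors=10 \
      --conf spark.dynamicAllocation.maxExecutors=20 \
      --num-executors 5 \
      my-app.jar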