Tags: apache-spark, hadoop-yarn, spark-submit

Spark application override yarn-site.xml config parameters


I need to override one Yarn configuration parameter in yarn-site.xml when I submit a Spark application. Can I pass it as an extra param to spark-submit?

The parameter I want to override is yarn.nodemanager.vmem-check-enabled


Solution

  • You can use --conf while submitting the job with spark-submit. Note that --conf expects a key=value pair, and Hadoop/YARN properties need the spark.hadoop. prefix so that Spark forwards them to the underlying Hadoop configuration:

    --conf "spark.hadoop.yarn.nodemanager.vmem-check-enabled=false"
    
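    For example, a complete submit command might look like the following sketch (the application class, jar name, and resource sizes are placeholders, not from the question):

    ```shell
    # Submit to YARN, forwarding the YARN property via the spark.hadoop. prefix.
    # my.example.MainClass and my-app.jar are hypothetical placeholders.
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --class my.example.MainClass \
      --conf "spark.hadoop.yarn.nodemanager.vmem-check-enabled=false" \
      my-app.jar
    ```

    Keep in mind that properties read by long-running YARN daemons themselves (rather than per application) may still require changing yarn-site.xml on the cluster nodes and restarting the NodeManagers.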

    Alternatively, you can set it inside your program, either on the SparkConf used to build the session or at runtime via SparkSession.conf.set.

    From the doc

    Configuration for a Spark application. Used to set various Spark parameters as key-value pairs.

    Most of the time, you would create a SparkConf object with new SparkConf(), which will load values from any spark.* Java system properties set in your application as well. In this case, parameters you set directly on the SparkConf object take priority over system properties.

    For unit tests, you can also call new SparkConf(false) to skip loading external settings and get the same configuration no matter what the system properties are.

    All setter methods in this class support chaining. For example, you can write new SparkConf().setMaster("local").setAppName("My app").
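    Putting the doc's chaining example together with the YARN override above, a minimal sketch (assuming Spark on YARN; the app name is a placeholder) might look like:

    ```scala
    import org.apache.spark.SparkConf
    import org.apache.spark.sql.SparkSession

    // Hadoop/YARN properties are forwarded when prefixed with "spark.hadoop."
    val conf = new SparkConf()
      .setAppName("My app") // placeholder name, as in the doc's chaining example
      .set("spark.hadoop.yarn.nodemanager.vmem-check-enabled", "false")

    val spark = SparkSession.builder().config(conf).getOrCreate()
    ```

    Set such properties before the session is created; Hadoop configuration is read when the YARN client starts, so calling conf.set afterwards will not affect it.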