Tags: apache-spark, apache-spark-standalone

How to specify custom conf file for Spark Standalone's master?


Every time I start Spark Standalone's master, I have to change a different set of configs (spark-env.sh) depending on the application. As of now, I edit spark-env.sh every time I need to overwrite or change a variable in it.

Is there a way to pass the conf file externally while executing sbin/start-master.sh?


Solution

  • Use --properties-file with the path to a custom Spark properties file. It defaults to $SPARK_HOME/conf/spark-defaults.conf.

    $ ./sbin/start-master.sh --help
    Usage: ./sbin/start-master.sh [options]
    
    Options:
      -i HOST, --ip HOST     Hostname to listen on (deprecated, please use --host or -h)
      -h HOST, --host HOST   Hostname to listen on
      -p PORT, --port PORT   Port to listen on (default: 7077)
      --webui-port PORT      Port for web UI (default: 8080)
      --properties-file FILE Path to a custom Spark properties file.
                             Default is conf/spark-defaults.conf.
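
    For example, you could keep one properties file per application and point the master at whichever one you need. The file path and the properties below are only illustrative (any spark-defaults.conf-style key/value pairs would do):

    # /path/to/my-app.conf -- hypothetical path, same key/value format as spark-defaults.conf
    spark.deploy.defaultCores   4
    spark.deploy.spreadOut      false

    $ ./sbin/start-master.sh --properties-file /path/to/my-app.conf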
    

    If, however, you want to set environment variables, you have to set them as you would for any other command-line application, e.g.

    SPARK_LOG_DIR=here-my-value ./sbin/start-master.sh
    

    One idea would be to use the SPARK_CONF_DIR environment variable to point to a custom directory with the required configuration.

    From sbin/spark-daemon.sh (which is executed as part of start-master.sh):

    SPARK_CONF_DIR Alternate conf dir. Default is ${SPARK_HOME}/conf.

    So, point SPARK_CONF_DIR at a directory that holds your custom configuration files, as in the sketch below.
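
    A minimal sketch, assuming a hypothetical /path/to/my-conf directory: copy the files you want to customize there, edit them per application, and set SPARK_CONF_DIR before starting the master:

    $ mkdir -p /path/to/my-conf
    $ cp conf/spark-env.sh conf/spark-defaults.conf /path/to/my-conf/
    $ # edit /path/to/my-conf/spark-env.sh for this application
    $ SPARK_CONF_DIR=/path/to/my-conf ./sbin/start-master.sh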

    I've just noticed that the spark-daemon.sh script accepts --config <conf-dir>, so it looks like you could use --config instead of the SPARK_CONF_DIR environment variable.
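
    If you want to try --config, one untested sketch would be to call spark-daemon.sh directly, passing the same class and instance number that start-master.sh itself uses (the conf directory path is hypothetical):

    $ ./sbin/spark-daemon.sh --config /path/to/my-conf start \
        org.apache.spark.deploy.master.Master 1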