SparkLauncher: run spark-submit in yarn-client mode as user hive

I'm trying to run a Spark job with master URL yarn-client, using SparkLauncher 2.10. The Java code is wrapped in a NiFi processor, and NiFi currently runs as root. When I run yarn application -list, the Spark job shows up with USER = root. I want it to run with USER = hive. This is my SparkLauncher code:

Process spark = new SparkLauncher()
    //   .setConf(SparkLauncher.DRIVER_EXTRA_JAVA_OPTIONS,"")
    .setConf(SparkLauncher.DRIVER_EXTRA_JAVA_OPTIONS, "-Dlog4j.configuration=file:///opt/eim/")

Do I need to pass the user as a driver extra Java option? The environment is non-Kerberos. I read somewhere that the user name has to be passed as a driver extra Java option, but I can't find that post now.


  • export HADOOP_USER_NAME=hive worked. SparkLauncher has a constructor overload that accepts a Map of environment variables. As for spark.yarn.principal, the environment is non-Kerberos, and as far as I have read, spark.yarn.principal applies only with Kerberos. I did the following:

    Process spark = new SparkLauncher(getEnvironmentVar(ps.getRunAs()))
                            //   .setConf(SparkLauncher.DRIVER_EXTRA_JAVA_OPTIONS,"")
                            .setConf(SparkLauncher.DRIVER_EXTRA_JAVA_OPTIONS, "-Dlog4j.configuration=file:///opt/eim/")

    Instead of new SparkLauncher(), I used SparkLauncher(java.util.Map<String,String> env) and added or replaced HADOOP_USER_NAME=hive in that map. yarn application -list now shows the job launched as intended, with USER=hive.
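
The getEnvironmentVar helper used above is not shown in the post, so here is a minimal sketch of what it might look like. The class name LauncherEnv and the method body are assumptions for illustration, not the original code. It only builds the map, on the understanding that SparkLauncher sets these entries on top of the environment the child spark-submit process inherits.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: the original post does not show getEnvironmentVar,
// so this body is an assumption about what it could do.
class LauncherEnv {

    // Builds the extra environment for the spark-submit child process.
    // HADOOP_USER_NAME is only honoured with simple (non-Kerberos) authentication.
    static Map<String, String> getEnvironmentVar(String runAs) {
        Map<String, String> env = new HashMap<>();
        if (runAs != null && !runAs.trim().isEmpty()) {
            env.put("HADOOP_USER_NAME", runAs.trim());
        }
        return env;
    }

    public static void main(String[] args) {
        Map<String, String> env = getEnvironmentVar("hive");
        System.out.println(env); // prints {HADOOP_USER_NAME=hive}

        // With spark-launcher on the classpath, the map is passed to the
        // constructor overload mentioned above:
        //   Process spark = new SparkLauncher(env) /* ...setters... */ .launch();
    }
}
```

Returning an empty map for a blank user name leaves the child process running as the NiFi user, which seemed a safer default for a sketch than failing the launch.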