java, maven, hadoop, apache-spark, hadoop-yarn

spark-submit through java code


I am trying to run spark-submit from Java code. I am following this example:

https://github.com/mahmoudparsian/data-algorithms-book/blob/master/misc/how-to-submit-spark-job-to-yarn-from-java-code.md

But I am getting this error:

The constructor ClientArguments(String[], SparkConf) is undefined

This is my code.

import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;

public class SparkSubmitJava {
    public static void main(String[] arguments) throws Exception {
        String[] args = new String[] {"--name", "myname", "--jar", "/home/cloudera/Desktop/ScalaTest.jar", "--class", "ScalaTest.ScalaTest.ScalaTest", "--arg","3", "--arg", "yarn-cluster"};

        Configuration config = new Configuration();
        System.setProperty("SPARK_YARN_MODE", "true");
        SparkConf sparkConf = new SparkConf();
        ClientArguments cArgs = new ClientArguments(args, sparkConf);  // getting constructor error
        Client client = new Client(cArgs, config, sparkConf); // getting constructor error
        client.run();
    }
}

My pom.xml dependency section:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>1.3.0</version>
</dependency>

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-yarn_2.9.3</artifactId>
    <version>0.8.1-incubating</version>
</dependency>

Any help will be appreciated.


Solution

  • Judging from the pom.xml you shared, the problem is that you are using a very old version of the spark-yarn library, 0.8.1-incubating. Replace it with the version that matches your spark-core. Since you are using Spark 1.3.0, use the following dependency instead of your current one:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-yarn_2.10</artifactId>
        <version>1.3.0</version>
    </dependency>
    

    Secondly, the Scala versions of your libraries are incompatible. The _2.10 and _2.9.3 suffixes are important: each one names the Scala version that the artifact was compiled against, so all of your Spark dependencies should carry the same suffix.
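
    As a rough illustration of the suffix rule above, here is a minimal sketch that extracts the Scala suffix from each artifactId and reports whether a set of dependencies mixes Scala versions. The class and method names (ScalaSuffixCheck, scalaSuffix, distinctSuffixes) are hypothetical helpers made up for this example; they are not part of Spark or Maven, and the check assumes the suffix is whatever follows the last underscore in the artifactId.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of Spark or Maven.
final class ScalaSuffixCheck {

    // Returns the Scala suffix after the last underscore, e.g. "2.10" for
    // "spark-core_2.10", or null if the artifactId has no version-like suffix.
    static String scalaSuffix(String artifactId) {
        int idx = artifactId.lastIndexOf('_');
        if (idx < 0 || idx == artifactId.length() - 1) {
            return null;
        }
        String suffix = artifactId.substring(idx + 1);
        return suffix.matches("\\d+(\\.\\d+)+") ? suffix : null;
    }

    // Collects the distinct Scala suffixes found across all artifactIds.
    // More than one entry means the dependencies mix Scala versions.
    static Set<String> distinctSuffixes(List<String> artifactIds) {
        Set<String> suffixes = new TreeSet<>();
        for (String id : artifactIds) {
            String suffix = scalaSuffix(id);
            if (suffix != null) {
                suffixes.add(suffix);
            }
        }
        return suffixes;
    }

    public static void main(String[] args) {
        // artifactIds from the question's pom.xml
        List<String> original = Arrays.asList("spark-core_2.10", "spark-yarn_2.9.3");
        // artifactIds after applying the fix from this answer
        List<String> fixed = Arrays.asList("spark-core_2.10", "spark-yarn_2.10");

        System.out.println("original mixes Scala versions: " + (distinctSuffixes(original).size() > 1));
        System.out.println("fixed mixes Scala versions: " + (distinctSuffixes(fixed).size() > 1));
    }
}
```

    With the artifactIds from the question, the first list yields two different suffixes and the second yields one. The check only looks at the names, so you still need the version numbers of spark-core and spark-yarn to match.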