apache-spark, hadoop-yarn

How to submit jobs to Spark using the YARN REST API?

I want to use the YARN REST API to submit jobs to Spark.


I am building an interface for triggering Spark jobs and checking job status.

I cannot use third-party libraries like Livy or Spark Job Server. I want to build APIs for starting and submitting jobs to the Spark cluster via REST.


Solution

  • You can use Spark Job Server - https://github.com/spark-jobserver/spark-jobserver

    Update -

    I missed that Spark Job Server cannot be used. In that case you can call Spark's built-in REST submission API directly, as shown below.

    Job Submission

    curl -X POST http://spark-cluster-ip:6066/v1/submissions/create --header "Content-Type:application/json;charset=UTF-8" --data '{
      "action" : "CreateSubmissionRequest",
      "appArgs" : [ "myAppArgument1" ],
      "appResource" : "file:/myfilepath/spark-job-1.0.jar",
      "clientSparkVersion" : "1.5.0",
      "environmentVariables" : {
        "SPARK_ENV_LOADED" : "1"
      },
      "mainClass" : "com.mycompany.MyJob",
      "sparkProperties" : {
        "spark.jars" : "file:/myfilepath/spark-job-1.0.jar",
        "spark.driver.supervise" : "false",
        "spark.app.name" : "MyJob",
        "spark.eventLog.enabled": "true",
        "spark.submit.deployMode" : "cluster",
        "spark.master" : "spark://spark-cluster-ip:6066"
      }
    }'
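
    If your interface submits jobs programmatically rather than shelling out to curl, the same CreateSubmissionRequest can be posted from code. Below is a minimal sketch in Python using the requests library; the host, jar path, main class and Spark version are just the placeholder values from the curl example above and need to be adapted to your cluster.

    import requests

    # Placeholder values taken from the curl example above -- adjust for your cluster.
    SUBMIT_URL = "http://spark-cluster-ip:6066/v1/submissions/create"

    payload = {
        "action": "CreateSubmissionRequest",
        "appArgs": ["myAppArgument1"],
        "appResource": "file:/myfilepath/spark-job-1.0.jar",
        "clientSparkVersion": "1.5.0",
        "environmentVariables": {"SPARK_ENV_LOADED": "1"},
        "mainClass": "com.mycompany.MyJob",
        "sparkProperties": {
            "spark.jars": "file:/myfilepath/spark-job-1.0.jar",
            "spark.driver.supervise": "false",
            "spark.app.name": "MyJob",
            "spark.eventLog.enabled": "true",
            "spark.submit.deployMode": "cluster",
            "spark.master": "spark://spark-cluster-ip:6066",
        },
    }

    response = requests.post(
        SUBMIT_URL,
        json=payload,
        headers={"Content-Type": "application/json;charset=UTF-8"},
    )
    response.raise_for_status()

    # A successful response contains the driver id (e.g. "driver-20151008145126-0000"),
    # which you need later to query the job status.
    submission = response.json()
    print(submission.get("submissionId"), submission.get("success"))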
    

    Job Status

    curl http://spark-cluster-ip:6066/v1/submissions/status/driver-20151008145126-0000
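
    For the status-checking part of your interface, the same endpoint can be polled until the driver reaches a terminal state. A minimal sketch, again in Python with requests, using the placeholder driver id from the curl example above (the set of terminal driverState values is an assumption; adjust it for what your Spark version actually reports):

    import time

    import requests

    STATUS_URL = "http://spark-cluster-ip:6066/v1/submissions/status/{driver_id}"

    def wait_for_completion(driver_id, poll_seconds=10):
        """Poll the status endpoint until the driver reaches a terminal state."""
        # Assumed terminal driverState values -- verify against your Spark version.
        terminal_states = {"FINISHED", "FAILED", "KILLED", "ERROR"}
        while True:
            status = requests.get(STATUS_URL.format(driver_id=driver_id)).json()
            state = status.get("driverState")
            print(driver_id, state)
            if state in terminal_states:
                return status
            time.sleep(poll_seconds)

    # Example with the placeholder driver id from the curl example above.
    wait_for_completion("driver-20151008145126-0000")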