
Can spark-submit be used as a job scheduler?


I have a Spark standalone cluster with no other job scheduler installed. Can spark-submit be used as a job scheduler for both Spark and non-Spark jobs (e.g. a Scala jar that is not written for Spark and does not use RDDs)?

Based on my testing, spark-submit can be used to submit non-Spark jobs, and the jobs run successfully. But I have the following questions:

  1. Are the following options still meaningful for non-Spark jobs? --driver-cores --driver-memory --executor-memory --total-executor-cores
  2. If the answer to 1 is no, does that mean spark-submit can maintain a FIFO queue of Spark and non-Spark jobs but does not manage the resources of the non-Spark jobs?
  3. If 2 is true, should I use another scheduler, e.g. SGE or LSF, to submit non-Spark jobs?

Thanks!


Solution

  • I figured this out after a lot of testing. Yes, Spark standalone can act as a job scheduler for both Spark and non-Spark jobs.

    1. However, for non-Spark jobs, spark-submit only creates a driver and no executors.
    2. Jobs are scheduled in a FIFO queue, and the job at the head of the queue starts only when its resource requirements (e.g. the cores and memory specified in the spark-submit command) are met.
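
The FIFO behavior described in point 2 can be illustrated with a small toy model. This is only a sketch of the behavior observed above, not Spark's actual scheduler code. The `Job` class, the `start_ready_jobs` function, and the job names and sizes are all hypothetical, and the strict rule that later jobs wait behind a head job that does not fit is an assumption of this sketch.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int       # stands in for a value such as --driver-cores
    memory_mb: int   # stands in for a value such as --driver-memory


def start_ready_jobs(queue, free_cores, free_memory_mb):
    """Start jobs from the head of a FIFO queue while resources suffice.

    Assumption of this toy model: scheduling stops at the first job that
    does not fit, so later jobs wait behind it even if they are smaller.
    Returns (names of started jobs, remaining cores, remaining memory).
    """
    started = []
    while (queue
           and queue[0].cores <= free_cores
           and queue[0].memory_mb <= free_memory_mb):
        job = queue.popleft()
        free_cores -= job.cores
        free_memory_mb -= job.memory_mb
        started.append(job.name)
    return started, free_cores, free_memory_mb


# Hypothetical jobs: "big" needs more cores than remain once "etl" starts.
queue = deque([Job("etl", 2, 2048), Job("big", 8, 4096), Job("small", 1, 512)])
started, cores_left, memory_left = start_ready_jobs(
    queue, free_cores=4, free_memory_mb=8192
)
print(started)                  # ['etl']
print([j.name for j in queue])  # ['big', 'small']
```

In this model "small" would fit in the remaining resources but still waits, because "big" is ahead of it in the queue. Whether a real standalone master lets later jobs bypass a blocked head job is worth verifying on your own cluster.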