apache-spark, hadoop, hadoop-yarn

Multiple Spark applications at the same time, same JAR file... jobs are in waiting status


Spark/Scala noob here.

I am running Spark in a clustered environment. I have two very similar apps, each with its own unique Spark config and context. When I try to kick them both off, the first seems to grab all the resources and the second waits to grab resources. I am setting resources on the submit, but it doesn't seem to matter. Each node has 24 cores and 45 GB of memory available for use. Here are the two commands I am using to submit the apps that I want to run in parallel.

./bin/spark-submit --master spark://MASTER:6066 --class MainAggregator --conf spark.driver.memory=10g --conf spark.executor.memory=10g --executor-cores 3 --num-executors 5 sparkapp_2.11-0.1.jar -new

./bin/spark-submit --master spark://MASTER:6066 --class BackAggregator --conf spark.driver.memory=5g --conf spark.executor.memory=5g --executor-cores 3 --num-executors 5 sparkapp_2.11-0.1.jar 01/22/2020 01/23/2020

I should also note that the second app does kick off, but in the master monitoring web page I see it as "Waiting" and it has 0 cores until the first is done. The apps pull from the same tables, but the data chunks they pull are very different, so the RDDs/DataFrames are unique, if that makes a difference.

What am I missing in order to run these at the same time?


Solution

  • The second app does kick off, but in the master monitoring web page I see it as "Waiting" and it has 0 cores until the first is done.


    I encountered the same thing some time back. There are two likely causes here:

    1) You don't have enough infrastructure.

    2) You might be using the Capacity Scheduler, which doesn't have a preemption mechanism and so cannot accommodate new jobs until the running ones release their resources.

    If it is #1, then you have to add more nodes or adjust the resources you allocate in your spark-submit, as in the sketch below.
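
    For example, a sketch of the first submit command with a core cap added: on a standalone master (as in the question) an application will otherwise try to take every available core, and --total-executor-cores (or spark.cores.max) caps that so the cluster keeps room for the second app. The numbers here are only illustrative.

    ./bin/spark-submit --master spark://MASTER:6066 --class MainAggregator \
        --conf spark.driver.memory=10g --conf spark.executor.memory=10g \
        --executor-cores 3 --total-executor-cores 15 \
        sparkapp_2.11-0.1.jar -new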

    If it is #2, then you can adopt the Hadoop Fair Scheduler, where you can maintain two pools (see the Spark documentation on this). The advantage is that you can run parallel jobs: the Fair Scheduler takes care of it by preempting some resources and allocating them to the other job running in parallel, for example:

    • mainpool for the first Spark job
    • backlogpool for the second Spark job

    To achieve this, you need an XML file with a pool configuration like the following sample:

    <?xml version="1.0"?>
    <allocations>
      <pool name="default">
        <schedulingMode>FAIR</schedulingMode>
        <weight>3</weight>
        <minShare>3</minShare>
      </pool>
      <pool name="mainpool">
        <schedulingMode>FAIR</schedulingMode>
        <weight>3</weight>
        <minShare>3</minShare>
      </pool>
      <pool name="backlogpool">
        <schedulingMode>FAIR</schedulingMode>
        <weight>3</weight>
        <minShare>3</minShare>
      </pool>
    </allocations>
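
    To turn on fair scheduling and point Spark at this file, a sketch of the extra flags you could add to each spark-submit (the /path/to/fairscheduler.xml path is only a placeholder; use wherever you actually store the file):

    --conf spark.scheduler.mode=FAIR \
    --conf spark.scheduler.allocation.file=/path/to/fairscheduler.xml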
    
    

    Along with that, you need to make a few minor changes in the driver code, such as telling Spark which pool the first job should go to and which pool the second job should go to.
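
    A minimal sketch of that driver-side change in Scala, assuming the pool names from the XML above (MainAggregator uses mainpool; BackAggregator would set backlogpool instead):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch only: the app name and any other settings come from your existing setup.
    val sc = new SparkContext(new SparkConf().setAppName("MainAggregator"))

    // Jobs submitted from this thread are scheduled in "mainpool";
    // the second application's driver would set "backlogpool" here instead.
    sc.setLocalProperty("spark.scheduler.pool", "mainpool")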


    For more details, see my articles:

    hadoop-yarn-fair-schedular-advantages-explained-part1

    hadoop-yarn-fair-schedular-advantages-explained-part2

    Try these ideas to overcome the waiting. Hope this helps.