hadoop-yarn · spark-submit · spark-shell

Resource management for Spark Streaming jobs and scheduled Spark shell jobs on YARN


Our company has a 9-node Cloudera cluster.

We have 41 long-running Spark Streaming jobs [YARN cluster mode] and some regular Spark shell jobs scheduled to run daily at 1 PM.

All jobs are currently submitted as user A [with root permission].

The issue I encountered is that while all 41 Spark Streaming jobs are running, my scheduled jobs cannot obtain resources to run.

I have tried the YARN Fair Scheduler, but the scheduled jobs still do not run.
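(For context: by default the Fair Scheduler only balances newly allocated containers; it does not take resources away from already-running applications unless preemption is turned on via yarn.scheduler.fair.preemption=true in yarn-site.xml. A minimal fair-scheduler.xml sketch, with hypothetical queue names and illustrative resource figures:)

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml sketch; "streaming" and "batch" are hypothetical
     queue names, and the mb/vcore figures are illustrative, not tuned
     for this cluster -->
<allocations>
  <!-- queue for the long-running Spark Streaming jobs -->
  <queue name="streaming">
    <weight>3</weight>
    <!-- cap the queue so it can never occupy the whole cluster -->
    <maxResources>80000 mb, 40 vcores</maxResources>
  </queue>
  <!-- queue for the 1 PM scheduled jobs -->
  <queue name="batch">
    <weight>1</weight>
    <!-- guaranteed share; with preemption on, YARN reclaims containers
         from other queues if this share is not met within the timeout -->
    <minResources>20000 mb, 10 vcores</minResources>
    <minSharePreemptionTimeout>60</minSharePreemptionTimeout>
  </queue>
</allocations>
```

Jobs would then be routed to a queue at submit time, e.g. with spark-submit's --queue flag.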

We expect the Spark Streaming jobs to keep running at all times, but to give up resources whenever the other scheduled jobs start.

Please feel free to share your suggestions or possible solutions.


Solution

  • Your Spark Streaming jobs are consuming too many resources for your scheduled jobs to start. Either they are always scaled to a point where not enough resources are left for the scheduled jobs, or they are not scaling back.

    For the case where the streaming jobs are not scaling back, check whether dynamic resource allocation is enabled for them. One way to check is from the Spark shell, using spark.sparkContext.getConf.get("spark.streaming.dynamicAllocation.enabled"). If dynamic allocation is enabled, you could then reduce the minimum resources reserved for those jobs.
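    If it turns out to be off, the streaming flavour of dynamic allocation can be switched on at submit time. A sketch (executor counts are illustrative; note that, as far as I know, DStream-based streaming jobs use the spark.streaming.dynamicAllocation.* settings rather than the batch spark.dynamicAllocation.* ones, and the former requires the latter to be disabled):

```shell
# Sketch of submit-time settings; executor counts and the jar name
# are placeholders, not values taken from the question.
spark-submit \
  --master yarn --deploy-mode cluster \
  --conf spark.streaming.dynamicAllocation.enabled=true \
  --conf spark.streaming.dynamicAllocation.minExecutors=2 \
  --conf spark.streaming.dynamicAllocation.maxExecutors=10 \
  your-streaming-app.jar
```

    With a low minExecutors, an idle streaming job should release executors back to YARN, leaving room for the 1 PM batch jobs to start.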