Tags: apache-spark, hadoop, cloudera, hadoop2

Does memory configuration really matter with fair scheduler?


We have a Hadoop cluster with the Fair Scheduler configured. We used to see a scenario where, when there were not many jobs running on the cluster, a running job would try to grab as much of the available memory and cores as it could.

With the Fair Scheduler, do the executor memory and core settings really matter for Spark jobs? Or is it up to the Fair Scheduler to decide how much each job gets?
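For context, here is a minimal sketch of where those settings live in a Spark-on-YARN job. The app name, memory, core, and executor counts below are illustrative assumptions, not values from the question:

```scala
import org.apache.spark.sql.SparkSession

object ExecutorConfigDemo {
  def main(args: Array[String]): Unit = {
    // Illustrative values only: 4g of memory, 2 cores, 10 executors.
    val spark = SparkSession.builder()
      .appName("executor-config-demo")
      .config("spark.executor.memory", "4g")               // heap per executor
      .config("spark.executor.cores", "2")                 // concurrent tasks per executor
      .config("spark.executor.instances", "10")            // executors requested from YARN
      .config("spark.dynamicAllocation.enabled", "false")  // keep the request fixed
      .getOrCreate()

    // Whether YARN actually grants all 10 executors is decided by the
    // scheduler (here, the Fair Scheduler) based on the queue's current share.
    spark.stop()
  }
}
```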


Solution

  • It is the Fair Scheduler's policy that the first job submitted to it is given all of the available resources.

    When a second job is submitted, the available resources are split as (available resources) / (number of jobs), as illustrated in the sketch after this list.

    The main thing to check is the maximum container memory you have allowed for the job (e.g. yarn.scheduler.maximum-allocation-mb). If that limit equals the total resources available, then it is legitimate for your job to use all of them.
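A toy worked example of that fair-share split; the cluster size and job count are made-up numbers for illustration:

```scala
object FairShareDemo extends App {
  // Assumed numbers, not taken from the question.
  val clusterMemoryGb = 120                           // memory managed by the scheduler
  val runningJobs     = 3                             // jobs active in the same pool
  val fairShareGb     = clusterMemoryGb / runningJobs // instantaneous fair share per job

  println(s"Each of the $runningJobs jobs gets roughly $fairShareGb GB") // 40 GB
}
```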