Tags: unix, apache-spark, hadoop-yarn, ram, virtual-memory

Spark on YARN memory (physical + virtual) usage


I am struggling to understand how memory management works with Spark on YARN:

My spark-submit has

--executor-memory 48g
--num-executors 2

When I run top -p <pids_of_2_yarn_containers/executors>

VIRT    RES   %MEM
51.059g 0.015t ~4    (container 1)
51.039g 0.012t ~3    (container 2)

The total memory of the system is 380g.

And finally, in the YARN UI, when I click on each container's page I can see:

Resource: 54272 Memory (container 1)
Resource: 54272 Memory (container 2)

Why don't these metrics add up? I am requesting 48 GB for each Spark executor, yet YARN shows ~54 GB per container, while the OS reports ~15 GB of physical memory used (RES column in top) and ~51 GB of virtual memory used (VIRT column).


Solution

  • In yarn-site.xml

    yarn.scheduler.minimum-allocation-mb (tune this based on the cluster's RAM capacity) - The minimum allocation for every container request at the RM, in MB. Memory requests lower than this won't take effect; the request will be rounded up to at least this value.

    yarn.scheduler.maximum-allocation-mb (tune this based on the cluster's RAM capacity) - The maximum allocation for every container request at the RM, in MB. Memory requests higher than this won't take effect and will be capped to this value.

    yarn.nodemanager.resource.memory-mb - Amount of physical memory, in MB, that can be allocated for containers.

    yarn.nodemanager.vmem-pmem-ratio - The ratio of allowed virtual memory (physical + paged memory) to physical memory for each container; a container's virtual-memory limit is its physical allocation multiplied by this ratio. The default value is 2.1.

    yarn.nodemanager.resource.cpu-vcores - The number of CPU cores (vcores) that can be allocated to containers on each node.
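
    For reference, a minimal yarn-site.xml fragment showing these properties; the values below are illustrative assumptions for a node with roughly 380 GB of RAM, not your actual settings:

    <!-- Illustrative values only; tune to your cluster's capacity -->
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>1024</value>
    </property>
    <property>
      <name>yarn.scheduler.maximum-allocation-mb</name>
      <value>65536</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>368640</value>  <!-- leave some RAM for the OS and daemons -->
    </property>
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>2.1</value>  <!-- default; vmem limit = 2.1 x physical allocation -->
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>32</value>
    </property>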

    In mapred-site.xml

    mapreduce.map.memory.mb - Maximum memory each map task will use.

    mapreduce.reduce.memory.mb - Maximum memory each reduce task will use.

    mapreduce.map.java.opts - JVM heap size for map task.

    mapreduce.reduce.java.opts - JVM heap size for reduce task.
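
    Likewise, a sketch of the corresponding mapred-site.xml entries with illustrative values (these affect MapReduce jobs, not Spark executors); a common rule of thumb is to set the java.opts heap to roughly 80% of the matching container size:

    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>4096</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>8192</value>
    </property>
    <property>
      <name>mapreduce.map.java.opts</name>
      <value>-Xmx3276m</value>  <!-- ~80% of 4096 MB -->
    </property>
    <property>
      <name>mapreduce.reduce.java.opts</name>
      <value>-Xmx6553m</value>  <!-- ~80% of 8192 MB -->
    </property>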

    Spark settings

    The --executor-memory/spark.executor.memory setting controls the executor heap size, but JVMs can also use some memory off heap, for example for interned strings and direct byte buffers. The value of the spark.yarn.executor.memoryOverhead property is added to the executor memory to determine the full memory request to YARN for each executor. It defaults to max(384, 0.07 * spark.executor.memory), in MB; note that the overhead factor has varied across Spark versions, with later releases using 0.10.
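
    As a rough worked example (assuming an overhead factor of 0.10, which later Spark releases use, and yarn.scheduler.minimum-allocation-mb = 1024): 48g of heap is 49152 MB, the overhead is max(384, 0.10 * 49152) = ~4915 MB, so the total request is ~54067 MB; YARN then normalizes the request up to the next multiple of the minimum allocation, i.e. 53 * 1024 = 54272 MB, which matches the container size shown in the YARN UI. The RES column in top is lower simply because the JVM has not touched all of its 48 GB heap yet: resident memory grows only as pages are actually used, while VIRT covers the whole reserved address space.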

    The --num-executors command-line flag or the spark.executor.instances configuration property controls the number of executors requested.
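
    If you want the container size to be predictable, you can also set the overhead explicitly instead of relying on the default; a hedged sketch (the application class and jar are placeholders, and 6144 MB is just an illustrative overhead value):

    spark-submit \
      --master yarn \
      --num-executors 2 \
      --executor-memory 48g \
      --conf spark.yarn.executor.memoryOverhead=6144 \
      --class com.example.YourApp \
      your-application.jar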

    If you can share the values of all the parameters mentioned above, it will be possible to work out the exact memory allocation in your case.