I am confused about why Spark only uses a fraction of the Java heap. Why not use 100% of it, i.e. set spark.memory.fraction to 1.0?
What is the point of reserving the remaining 0.4 (by default)? Why leave that memory unused?
Is it used by Spark or by the JVM?
From the docs:
spark.memory.fraction expresses the size of M as a fraction of the (JVM heap space - 300MiB) (default 0.6). The rest of the space (40%) is reserved for user data structures, internal metadata in Spark, and safeguarding against OOM errors in the case of sparse and unusually large records.
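To make that arithmetic concrete, here is a small sketch of what the formula works out to for, say, a 4 GiB executor heap. This is my own illustration, not Spark's internal code; the names are made up.

    // Rough sketch of how the unified memory pool (M) is sized, per the formula
    // quoted above. Illustrative only, not Spark's actual implementation.
    object MemoryFractionSketch {
      // Spark always keeps a fixed 300 MiB reserved, per the docs
      val ReservedMemoryBytes: Long = 300L * 1024 * 1024

      // M = (heap - 300 MiB) * spark.memory.fraction
      def unifiedPoolSize(heapBytes: Long, memoryFraction: Double = 0.6): Long =
        ((heapBytes - ReservedMemoryBytes) * memoryFraction).toLong

      def main(args: Array[String]): Unit = {
        val heap = 4L * 1024 * 1024 * 1024        // e.g. an executor started with 4g of heap
        val m    = unifiedPoolSize(heap)          // execution + storage memory (M)
        val rest = heap - ReservedMemoryBytes - m // the ~40% my question is about
        println(f"Unified memory M: ${m / (1024.0 * 1024)}%.0f MiB, remaining: ${rest / (1024.0 * 1024)}%.0f MiB")
      }
    }

With the defaults, that prints roughly 2278 MiB for M and about 1518 MiB left over, which is the chunk I don't understand the purpose of.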