I am trying to run multiple YARN applications on EMR Spark, but I am unable to run more than 5 applications at a time.
I am using the following configuration for the Spark cluster:
Master = r5.2xlarge
Worker = r5.12xlarge (384 GB RAM, 48 virtual cores)
Deploy mode = cluster
JSON
{
  "Classification": "spark-defaults",
  "ConfigurationProperties": {
    "spark.executor.extraJavaOptions": "-XX:+UseG1GC -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:InitiatingHeapOccupancyPercent=35 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:OnOutOfMemoryError='kill -9 %p'",
    "spark.driver.extraJavaOptions": "-XX:+UseG1GC -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:InitiatingHeapOccupancyPercent=35 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:OnOutOfMemoryError='kill -9 %p'",
    "spark.scheduler.mode": "FIFO",
    "spark.eventLog.enabled": "true",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.dynamicAllocation.enabled": "false",
    "spark.executor.heartbeatInterval": "60s",
    "spark.network.timeout": "800s",
    "spark.executor.cores": "5",
    "spark.driver.cores": "5",
    "spark.executor.memory": "37000M",
    "spark.driver.memory": "37000M",
    "spark.yarn.executor.memoryOverhead": "5000M",
    "spark.yarn.driver.memoryOverhead": "5000M",
    "spark.executor.instances": "17",
    "spark.default.parallelism": "170",
    "spark.yarn.scheduler.reporterThread.maxFailures": "5",
    "spark.storage.level": "MEMORY_AND_DISK_SER",
    "spark.rdd.compress": "true",
    "spark.shuffle.compress": "true",
    "spark.shuffle.spill.compress": "true"
  }
}
How can I increase the number of YARN applications running in parallel on EMR Spark?
Take a look at the YARN UI running on the master node of the cluster. Have all of the CPUs and all of the memory in the cluster been used up? Increasing concurrency usually means that each individual application can only use a small portion of the cluster.

Also, because you've disabled dynamic executor allocation and set spark.executor.instances to 17, each application requests 17 executors of roughly 42 GB apiece (37000M of executor memory plus 5000M of memoryOverhead), which is about 714 GB of YARN memory per application, so you will likely only be able to run a single Spark application at a time.
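As a rough illustration (not a tested recommendation), a spark-defaults block along these lines would re-enable dynamic allocation and cap how many executors any one application can claim, so YARN can fit several applications on the cluster at once; the memory, core, and executor values here are assumptions you would need to tune to your own workload:

{
  "Classification": "spark-defaults",
  "ConfigurationProperties": {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "4",
    "spark.shuffle.service.enabled": "true",
    "spark.executor.cores": "5",
    "spark.executor.memory": "18000M",
    "spark.yarn.executor.memoryOverhead": "2000M",
    "spark.driver.cores": "2",
    "spark.driver.memory": "18000M",
    "spark.yarn.driver.memoryOverhead": "2000M"
  }
}

With a per-application cap like spark.dynamicAllocation.maxExecutors, no single job can grab the whole cluster up front. Note that dynamic allocation depends on the external shuffle service (spark.shuffle.service.enabled), which is set explicitly above in case your EMR release does not enable it by default.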