Tags: pyspark, hadoop-yarn, amazon-emr

How to increase YARN application parallelism


I am trying to run multiple YARN applications on EMR Spark, but I am unable to run more than 5 applications at a time.

I am using the following configuration for the Spark cluster:

Master = r5.2xlarge

Worker = r5.12xlarge (384 GB RAM, 48 virtual cores)

Deploy mode = cluster

JSON

{
  "Classification": "spark-defaults",
  "ConfigurationProperties": {
    "spark.executor.extraJavaOptions": "-XX:+UseG1GC -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:InitiatingHeapOccupancyPercent=35 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:OnOutOfMemoryError='kill -9 %p'",
    "spark.driver.extraJavaOptions": "-XX:+UseG1GC -XX:+UnlockDiagnosticVMOptions -XX:+G1SummarizeConcMark -XX:InitiatingHeapOccupancyPercent=35 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:OnOutOfMemoryError='kill -9 %p'",
    "spark.scheduler.mode": "FIFO",
    "spark.eventLog.enabled": "true",
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.dynamicAllocation.enabled": "false",
    "spark.executor.heartbeatInterval": "60s",
    "spark.network.timeout": "800s",
    "spark.executor.cores": "5",
    "spark.driver.cores": "5",
    "spark.executor.memory": "37000M",
    "spark.driver.memory": "37000M",
    "spark.yarn.executor.memoryOverhead": "5000M",
    "spark.yarn.driver.memoryOverhead": "5000M",
    "spark.executor.instances": "17",
    "spark.default.parallelism": "170",
    "spark.yarn.scheduler.reporterThread.maxFailures": "5",
    "spark.storage.level": "MEMORY_AND_DISK_SER",
    "spark.rdd.compress": "true",
    "spark.shuffle.compress": "true",
    "spark.shuffle.spill.compress": "true"
  }
}

How can I increase the number of YARN applications running in parallel on EMR Spark?


Solution

  • Take a look at the YARN UI running on the master node of the cluster. Have all of the CPUs and all of the memory in the cluster been utilized? Increasing concurrency usually means that each individual running application can only use a small portion of the cluster. Also, because you have disabled dynamic executor allocation and fixed the number of executors at 17, each application reserves a large, constant slice of the cluster: with spark.executor.memory at 37000M plus 5000M of memoryOverhead, every executor container requests roughly 42 GB, so 17 executors plus the driver add up to roughly 750 GB of YARN memory per application. Unless the cluster is considerably larger than that, you will likely only be able to run a single Spark application at a time. To fit more applications, each one has to be made smaller, as sketched below.
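
As a rough, hedged sketch (not a drop-in answer), a lighter-weight spark-defaults classification could re-enable dynamic allocation and shrink the per-executor footprint so that several applications fit on the cluster at once. The memory, core, and maxExecutors values below are illustrative assumptions that would need to be tuned against the YARN memory actually available on each node:

JSON

{
  "Classification": "spark-defaults",
  "ConfigurationProperties": {
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.maxExecutors": "8",
    "spark.executor.cores": "5",
    "spark.executor.memory": "18000M",
    "spark.yarn.executor.memoryOverhead": "2000M",
    "spark.driver.memory": "18000M",
    "spark.yarn.driver.memoryOverhead": "2000M"
  }
}

With executor containers around 20 GB instead of 42 GB, and dynamic allocation releasing idle executors, YARN can pack several applications onto the same nodes; the exact number you can run concurrently still depends on how the cluster's YARN scheduler (capacity or fair) is configured.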