I am running GraphX on Spark with input file size of around 100GB on aws EMR. My cluster configuration is as follows Nodes - 10 Memory - 122GB each HDD - 320GB each
No matter what I do I'm getting out of memory error when I run spark job as
spark-submit --deploy-mode cluster \
--class com.news.ncg.report.graph.NcgGraphx \
ncgaka-graphx-assembly-1.0.jar true s3://<bkt>/<folder>/run=2016-08-19-02-06-20/part* output
Error
AM Container for appattempt_1474446853388_0001_000001 exited with exitCode: -104
For more detailed output, check application tracking page:http://ip-172-27-111-41.ap-southeast-2.compute.internal:8088/cluster/app/application_1474446853388_0001Then, click on links to logs of each attempt.
Diagnostics: Container [pid=7902,containerID=container_1474446853388_0001_01_000001] is running beyond physical memory limits. Current usage: 1.4 GB of 1.4 GB physical memory used; 3.4 GB of 6.9 GB virtual memory used. Killing container.
Dump of the process-tree for container_1474446853388_0001_01_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 7907 7902 7902 7902 (java) 36828 2081 3522265088 359788 /usr/lib/jvm/java-openjdk/bin/java -server -Xmx1024m -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1474446853388_0001/container_1474446853388_0001_01_000001/tmp -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:MaxHeapFreeRatio=70 -XX:+CMSClassUnloadingEnabled -XX:OnOutOfMemoryError=kill -9 %p -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1474446853388_0001/container_1474446853388_0001_01_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class com.news.ncg.report.graph.NcgGraphx --jar s3://discover-pixeltoucher/jar/ncgaka-graphx-assembly-1.0.jar --arg true --arg s3://discover-pixeltoucher/ncgus/run=2016-08-19-02-06-20/part* --arg s3://discover-pixeltoucher/output/20160819/ --properties-file /mnt/yarn/usercache/hadoop/appcache/application_1474446853388_0001/container_1474446853388_0001_01_000001/__spark_conf__/__spark_conf__.properties
|- 7902 7900 7902 7902 (bash) 0 0 115810304 687 /bin/bash -c LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:::/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native::/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native:/usr/lib/hadoop-lzo/lib/native:/usr/lib/hadoop/lib/native /usr/lib/jvm/java-openjdk/bin/java -server -Xmx1024m -Djava.io.tmpdir=/mnt/yarn/usercache/hadoop/appcache/application_1474446853388_0001/container_1474446853388_0001_01_000001/tmp '-XX:+UseConcMarkSweepGC' '-XX:CMSInitiatingOccupancyFraction=70' '-XX:MaxHeapFreeRatio=70' '-XX:+CMSClassUnloadingEnabled' '-XX:OnOutOfMemoryError=kill -9 %p' -Dspark.yarn.app.container.log.dir=/var/log/hadoop-yarn/containers/application_1474446853388_0001/container_1474446853388_0001_01_000001 org.apache.spark.deploy.yarn.ApplicationMaster --class 'com.news.ncg.report.graph.NcgGraphx' --jar s3://discover-pixeltoucher/jar/ncgaka-graphx-assembly-1.0.jar --arg 'true' --arg 's3://discover-pixeltoucher/ncgus/run=2016-08-19-02-06-20/part*' --arg 's3://discover-pixeltoucher/output/20160819/' --properties-file /mnt/yarn/usercache/hadoop/appcache/application_1474446853388_0001/container_1474446853388_0001_01_000001/__spark_conf__/__spark_conf__.properties 1> /var/log/hadoop-yarn/containers/application_1474446853388_0001/container_1474446853388_0001_01_000001/stdout 2> /var/log/hadoop-yarn/containers/application_1474446853388_0001/container_1474446853388_0001_01_000001/stderr
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
Failing this attempt
Any idea how can I stop getting this error?
I created sparkSession as below
val spark = SparkSession
.builder()
.master(mode)
.config("spark.hadoop.validateOutputSpecs", "false")
.config("spark.driver.cores", "1")
.config("spark.driver.memory", "30g")
.config("spark.executor.memory", "19g")
.config("spark.executor.cores", "5")
.config("spark.yarn.executor.memoryOverhead","2g")
.config("spark.yarn.driver.memoryOverhead ","1g")
.config("spark.shuffle.compress","true")
.config("spark.shuffle.service.enabled","true")
.config("spark.scheduler.mode","FAIR")
.config("spark.speculation","true")
.appName("NcgGraphX")
.getOrCreate()
It seems like you want to deploy your Spark application on YARN. If that is the case, you should not set up application properties in code, but rather using spark-submit:
$ ./bin/spark-submit --class com.news.ncg.report.graph.NcgGraphx \
--master yarn \
--deploy-mode cluster \
--driver-memory 30g \
--executor-memory 19g \
--executor-cores 5 \
<other options>
ncgaka-graphx-assembly-1.0.jar true s3://<bkt>/<folder>/run=2016-08-19-02-06-20/part* output
In client
mode, the JVM would have been already set up, so I would personally use CLI to pass those options.
After passing memory options in spark-submit
change your code to load variables dynamically: SparkSession.builder().getOrCreate()
PS. You might also want to increase memory for AM in spark.yarn.am.memory
property.