I am trying to run a Spark application on some data on AWS. I was able to process the whole data set using 20 m4.large machines. When I tried the same job on c4.8xlarge machines, I got the following error:
AM Container for appattempt_1570270970620_0001_000001 exited with exitCode: -104
Failing this attempt.Diagnostics: Container [pid=12140,containerID=container_1570270970620_0001_01_000001] is running beyond physical memory limits. Current usage: 1.4 GB of 1.4 GB physical memory used; 3.5 GB of 6.9 GB virtual memory used. Killing container.
{...}
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143
The command I used to submit the application was:
spark-submit --deploy-mode cluster --class xyzpackage.xyzclass --master yarn --jar s3://path/xyz_2.11-1.0.jar --arg s3://path_to_files/xy.csv --arg s3://output_path/newfile
When the application starts, I see this output:
19/10/05 10:42:04 INFO RMProxy: Connecting to ResourceManager at ip-172-31-30-66.us-east-2.compute.internal/172.31.30.66:8032
19/10/05 10:42:04 INFO Client: Requesting a new application from cluster with 20 NodeManagers
19/10/05 10:42:04 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (53248 MB per container)
19/10/05 10:42:04 INFO Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
19/10/05 10:42:04 INFO Client: Setting up container launch context for our AM
19/10/05 10:42:04 INFO Client: Setting up the launch environment for our AM container
19/10/05 10:42:04 INFO Client: Preparing resources for our AM container
The AM container is allocated only 1408 MB (about 1.4 GB) of memory, which is why I am getting the error. How do I increase the AM container memory? I tried this, but no luck:
spark-submit --deploy-mode cluster --class xyzpackage.xyzclass --master yarn --conf spark.yarn.executor.memoryOverhead=8000 --driver-memory=91G --jar s3://path/xyz_2.11-1.0.jar --arg s3://path_to_files/* --arg s3://output_path/newfile
How should I edit this command to increase the AM container size?
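For context, the 1408 MB in the log matches Spark's defaults. In cluster mode the driver runs inside the AM container, so the request is the driver heap (1g by default) plus the overhead, which defaults to the larger of 384 MB and 10% of the heap. Below is a minimal sketch of that arithmetic. `container_request_mb` is a hypothetical helper, not part of Spark, and YARN's rounding to its allocation increment is not modeled.

```shell
#!/bin/sh
# Estimate the memory Spark requests from YARN for one container.
# Assumes Spark's default overhead rule: max(384 MB, 10% of the JVM heap).
# container_request_mb is a hypothetical helper, not a Spark or YARN command.
container_request_mb() {
  heap_mb=$1
  overhead_mb=$(( $heap_mb * 10 / 100 ))
  if [ "$overhead_mb" -lt 384 ]; then
    overhead_mb=384
  fi
  echo $(( $heap_mb + $overhead_mb ))
}

container_request_mb 1024   # default 1g driver heap -> 1408, as in the log
container_request_mb 4096   # a 4g driver heap -> 4505
```

The same rule applies to executors, using the executor heap in place of the driver heap.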
I figured out the mistake in my command. The command that changes the driver and executor overhead memory is:
spark-submit --deploy-mode cluster --class xyzpackage.xyzclass --master yarn --conf spark.driver.memoryOverhead=2048 --conf spark.executor.memoryOverhead=2048 --jar s3://path/xyz_2.11-1.0.jar --arg s3://path_to_files/* --arg s3://output_path/newfile
That said, I think it is better to set the memory directly with spark.driver.memory and spark.executor.memory, and to leave memoryOverhead at its default of 10% of the driver or executor memory (with a minimum of 384 MB).
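A sketch of what that might look like, using the `--driver-memory` and `--executor-memory` options of spark-submit. The 4g and 8g values are assumptions for illustration only and should be sized to the instance type. The script prints the command rather than running it, so it works without a cluster, and `build_submit_command` is a hypothetical helper name.

```shell
#!/bin/sh
# Assumed sizes, for illustration only; pick values that fit the instance type.
DRIVER_MEMORY=4g
EXECUTOR_MEMORY=8g

# Print the command instead of executing it (a dry run).
# The arguments after the memory options are copied unchanged from the command above.
build_submit_command() {
  echo "spark-submit --deploy-mode cluster --class xyzpackage.xyzclass --master yarn" \
       "--driver-memory $DRIVER_MEMORY --executor-memory $EXECUTOR_MEMORY" \
       "--jar s3://path/xyz_2.11-1.0.jar --arg s3://path_to_files/* --arg s3://output_path/newfile"
}

build_submit_command
```

No memoryOverhead setting is passed here, so Spark would fall back to its default overhead for both the driver and the executors.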