I created a very simple "Word count" Java-based Spark program, and I am running it on a YARN cluster with the following details:
Master Node (NN, SNN, RM) - 192.168.0.100
Slave Nodes (DN, NM) - 192.168.0.105, 192.168.0.108
Master running on : 192.168.0.100
Workers running on : 192.168.0.105, 192.168.0.108
I created a client machine from which I submit the Spark job (the IP address of the client machine is 192.168.0.240):
spark-submit --class com.example.WordCountTask --master yarn /root/SparkCodeInJava/word-count/target/word-count-1.0-SNAPSHOT.jar /spark/input/inputText.txt /spark/output
However, the program never terminates. The data set is very small (10 lines of text), so I expect it to finish quickly.
This is the output I see on the console after submitting the job:
17/03/26 19:54:42 INFO yarn.Client: Application report for application_1490572543329_0001 (state: ACCEPTED)
17/03/26 19:54:43 INFO yarn.Client: Application report for application_1490572543329_0001 (state: ACCEPTED)
17/03/26 19:54:44 INFO yarn.Client: Application report for application_1490572543329_0001 (state: ACCEPTED)
17/03/26 19:54:45 INFO yarn.Client: Application report for application_1490572543329_0001 (state: ACCEPTED)
17/03/26 19:54:46 INFO yarn.Client: Application report for application_1490572543329_0001 (state: ACCEPTED)
And this continues forever. I am not sure why it never completes.
This is what I see in the GUI for this application:
17/03/26 20:24:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library
/tmp/logs/root/logs/application_1490572543329_0002 does not exist.
Log aggregation has not completed or is not enabled.
This is my first Spark program, and I configured it to run on a YARN cluster.
I simulate the distributed environment using 4 VMs running CentOS on VirtualBox.
Can anyone help me figure out why this program isn't working properly?
I also set up the environment on AWS, with two launched instances of a good size (8 vCPUs and 32 GB RAM each), but the job still doesn't complete.
<property>
<name>yarn.nodemanager.auxservices</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>ip-XXX-YYYY-ZZZ-AAA.us-west-2.compute.internal:8032</value>
</property>
17/03/29 15:51:35 INFO yarn.Client: Requesting a new application from cluster with **0 NodeManagers**
Does this have anything to do with the job not finishing?
From the messages
YARN Application State: ACCEPTED, waiting for AM container to be allocated
17/03/29 15:51:35 INFO yarn.Client: Requesting a new application from cluster with **0 NodeManagers**
YARN is unable to allocate containers for the Spark application because there are no active NodeManagers available.
NodeManagers use the property yarn.resourcemanager.resource-tracker.address to communicate with the ResourceManager.
By default, the value of this property is:
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>${yarn.resourcemanager.hostname}:8031</value>
</property>
The referenced property yarn.resourcemanager.hostname defaults to 0.0.0.0. NodeManagers will not be able to communicate with the ResourceManager unless the RM hostname is defined properly.
Modify this property in yarn-site.xml on all the nodes:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>rm_hostname</value> <!-- Hostname of the node where Resource Manager is started -->
</property>
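After updating the hostname, you can verify that the NodeManagers have registered with the ResourceManager. This is just a quick check, not something from your setup; run it from any node that has the YARN client configured:

yarn node -list

It should list both slave nodes in RUNNING state; if it still reports zero nodes, check the NodeManager logs for connection errors to the resource-tracker address.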
Also, the property yarn.nodemanager.auxservices must be yarn.nodemanager.aux-services.
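For reference, a corrected shuffle-service block in yarn-site.xml might look like the sketch below; the ShuffleHandler class property is the usual companion setting and is an assumption on my part, not something taken from your posted configuration:

<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <!-- Usual companion setting for the shuffle service; verify against your Hadoop version -->
  <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>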
Restart the services after the changes.
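Assuming a standard Hadoop sbin layout (paths and scripts may differ depending on how your cluster is managed), a restart could look like this:

# On the ResourceManager node: stop and start all YARN daemons
$HADOOP_HOME/sbin/stop-yarn.sh
$HADOOP_HOME/sbin/start-yarn.sh

# Or restart an individual NodeManager on a slave node
$HADOOP_HOME/sbin/yarn-daemon.sh stop nodemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager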