I am having issues submitting a spark-submit remote job from a machine outside the Spark cluster running on YARN.
Exception in thread "main" java.net.ConnectException: Call from remote.dev.local/192.168.10.65 to target.dev.local:8020 failed on connection exception: java.net.ConnectException: Connection refused
In my core-site.xml:
<property>
<name>fs.defaultFS</name>
<value>hdfs://target.dev.local:8020</value>
</property>
Also, in my hdfs-site.xml on the cluster I have disabled permission checking for HDFS:
<property>
<name>dfs.permissions.enabled</name>
<value>false</value>
</property>
Also, when I telnet from the machine outside the cluster:
telnet target.dev.local 8020
I am getting
telnet: connect to address 192.168.10.186: Connection Refused
But, when I
telnet target.dev.local 9000
it says Connected.
Also, when I ping target.dev.local, it works.
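For reference, a way to check which address the NameNode RPC port is actually bound to on target.dev.local is something like (assuming ss or netstat is available there):

ss -tlnp | grep -E ':8020|:9000'
# or, on older systems:
netstat -tlnp | grep -E ':8020|:9000'

If 8020 shows up bound only to 127.0.0.1 or an internal interface, connections from outside the cluster will be refused.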
My spark-submit script from the remote machine is:
export HADOOP_CONF_DIR=/<path_to_conf_dir_copied_from_cluster>/
spark-submit --class org.apache.spark.examples.SparkPi \
--master yarn \
--deploy-mode cluster \
--driver-memory 5g \
--executor-memory 50g \
--executor-cores 5 \
--queue default \
<path to jar>.jar \
10
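As a sanity check from the same remote machine (assuming a Hadoop client is installed there), HDFS connectivity with the copied configuration can be tested independently of Spark:

export HADOOP_CONF_DIR=/<path_to_conf_dir_copied_from_cluster>/
hdfs dfs -ls /

If that fails with the same ConnectException, the issue is reaching the NameNode on port 8020 rather than anything in the spark-submit command itself.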
What am I missing here?
Turns out I had to change
<property>
<name>fs.defaultFS</name>
<value>hdfs://target.dev.local:8020</value>
</property>
to
<property>
<name>fs.defaultFS</name>
<value>hdfs://0.0.0.0:8020</value>
</property>
to allow connections from the outside, since target.dev.local sits behind a private network switch. Binding the NameNode to 0.0.0.0 makes it listen on all interfaces rather than only the internal one.
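The NameNode has to be restarted for the change to take effect. A rough verification sketch (assuming a standard Hadoop sbin layout; paths may differ):

# on the NameNode host
$HADOOP_HOME/sbin/stop-dfs.sh
$HADOOP_HOME/sbin/start-dfs.sh

# port 8020 should now be listening on all interfaces
ss -tlnp | grep ':8020'
# expected: LISTEN ... 0.0.0.0:8020

After that, telnet target.dev.local 8020 from the remote machine should connect.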