apache-spark hadoop pyspark bitnami

"Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher" when running spark-submit or PySpark


I am trying to run the spark-submit command on my Hadoop cluster. Here is a summary of the cluster:

  • The cluster is built from 5 VirtualBox VMs connected on an internal network.
  • There is 1 namenode and 4 datanodes.
  • All the VMs were built from the Bitnami Hadoop Stack VirtualBox image.

I am trying to run one of the Spark examples using the following spark-submit command:

spark-submit --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.12-3.0.3.jar 10

I get the following error:

[2022-07-25 13:32:39.253]Container exited with a non-zero exit code 1. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
Error: Could not find or load main class org.apache.spark.deploy.yarn.ExecutorLauncher

I get the same error when trying to run a script with PySpark.

I have tried/verified the following:

  • Environment variables: HADOOP_HOME, SPARK_HOME and HADOOP_CONF_DIR are set in my .bashrc file.
  • SPARK_DIST_CLASSPATH and HADOOP_CONF_DIR are defined in spark-env.sh.
  • I added spark.master yarn, spark.yarn.stagingDir hdfs://hadoop-namenode:8020/user/bitnami/sparkStaging and spark.yarn.jars hdfs://hadoop-namenode:8020/user/bitnami/spark/jars/ to spark-defaults.conf.
  • I uploaded the jars to HDFS (i.e. hadoop fs -put $SPARK_HOME/jars/* hdfs://hadoop-namenode:8020/user/bitnami/spark/jars/ ).
  • The logs accessible via the web interface (i.e. http://hadoop-namenode:8042 ) do not provide any further details about the error.
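
The environment-variable check from the first item above can be scripted so that it is easy to repeat on every node. The following is only a minimal sketch: report_unset_vars is a hypothetical helper name, not part of Spark or Hadoop, and it only tests whether each variable is non-empty in the current shell, not whether it points at a valid installation.

```shell
# Hypothetical helper: print the name of every listed variable
# that is unset or empty in the current shell.
report_unset_vars() {
  for name in "$@"; do
    # Indirect lookup of the variable called $name (empty if unset).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name"
    fi
  done
}
```

Running report_unset_vars HADOOP_HOME SPARK_HOME HADOOP_CONF_DIR prints nothing when all three are set. Note that this check would not have caught the problem described here, since the error can occur even when all three variables are set.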

Solution

  • I figured out why I was getting this error: I had specified spark.yarn.jars incorrectly in spark-defaults.conf.

    The value of this property must be

    hdfs://hadoop-namenode:8020/user/bitnami/spark/jars/*
    

    instead of

     hdfs://hadoop-namenode:8020/user/bitnami/spark/jars/
    

    In other words, the value of this property must name the jar files themselves (here via the /* glob), not the folder that contains them.
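
To catch this kind of mistake before submitting a job, the configured value can be inspected with a small script. This is a rough sketch under stated assumptions: check_yarn_jars is a hypothetical helper name, it assumes the property and its value are separated by whitespace (as in the settings quoted in the question), and it only looks at how the whole value ends, so a comma-separated list is not examined entry by entry.

```shell
# Hypothetical helper: report whether spark.yarn.jars in the given
# spark-defaults.conf names jar files (a glob or a .jar path) or
# looks like a bare directory.
check_yarn_jars() {
  conf_file="$1"
  # Keep the last definition of the property found in the file.
  value=$(awk '$1 == "spark.yarn.jars" { v = $2 } END { print v }' "$conf_file")
  if [ -z "$value" ]; then
    echo "unset"
    return 0
  fi
  case "$value" in
    # Ends in ".jar" or in a literal "*": the value names jar files.
    *.jar|*'*') echo "jars: $value" ;;
    # Anything else is treated as a directory-style value.
    *) echo "directory: $value" ;;
  esac
}
```

For example, check_yarn_jars "$SPARK_HOME/conf/spark-defaults.conf" would print a line starting with "directory:" for the original, broken setting and a line starting with "jars:" once the trailing /* is added.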