Tags: apache-spark, mapr

Why ExceptionInInitializerError when submitting Spark application in YARN cluster mode?


I am using Spark 1.6.1-mapr-1604.
My job executes successfully in local mode, but when I launch the same job in YARN cluster mode it throws ExceptionInInitializerError.

Local mode command:

spark-submit --class com.ts.clustering.TrainModel \
ts-0.0.1-SNAPSHOT.jar \
-model /user/hive/warehouse/ts/clustering_model \
-ip /user/hive/warehouse/ts/aidata_seq/* \
-k 10 -ite 10 > app_2.log &

Yarn cluster mode:

spark-submit --queue dev --master yarn \
--deploy-mode cluster \
--class com.ts.clustering.TrainModel ts-0.0.1-SNAPSHOT.jar \
-model /user/hive/warehouse/ts/clustering_model \
-ip /user/hive/warehouse/ts/aidata_seq/* -k 10 -ite 10 > app_2.log &

The -model parameter is the output location where the trained model is saved.

The exception in cluster mode:

2016-08-29 17:18:46,312 WARN  [task-result-getter-0] scheduler.TaskSetManager: 
  Lost task 0.0 in stage 0.0 (TID 0, ******************): java.lang.ExceptionInInitializerError
        at com.ts.clustering.TrainModel$2.call(TrainModel.java:71)
        at com.ts.clustering.TrainModel$2.call(TrainModel.java:67)
        at org.apache.spark.api.java.JavaPairRDD$$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1015)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$$anon$14.hasNext(Iterator.scala:389)
        at org.apache.spark.storage.MemoryStore.unrollSafely(MemoryStore.scala:284)
Caused by: org.apache.spark.SparkException: A master URL must be set in your configuration
        at org.apache.spark.SparkContext.<init>(SparkContext.scala:401)
        at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
        at com.ts.clustering.TrainModel.<clinit>(TrainModel.java:35)
        ... 21 more

I have gone through a couple of similar posts, but nothing helped. Any suggestion would be a great help.


Solution

  • Looking at the stack trace, I could spot the following:

    org.apache.spark.SparkException: A master URL must be set in your configuration
      at org.apache.spark.SparkContext.<init>(SparkContext.scala:401)
      at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:59)
      at com.ts.clustering.TrainModel.<clinit>(TrainModel.java:35)
      ... 21 more
    

    The exception says that no master URL was set, yet you started the application with --master yarn --deploy-mode cluster.

    I'm guessing that you either hardcoded the master URL as local or created the SparkContext in a static initializer with no master set at all; see TrainModel.java:35, which the trace shows running in <clinit>, i.e. when the class is loaded. In cluster mode the class is also loaded on the executors (the failing code is in TrainModel$2.call), and there the static initializer tries to build another SparkContext without a master URL, hence the error. Local mode works because the driver and executors share one JVM in which spark-submit has already set the master. Move the SparkContext creation into main, as sketched below.
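
    For illustration, here is a minimal sketch of the suspected pattern and the fix, assuming TrainModel keeps the context in a static field (the actual contents of TrainModel.java are not shown in the question, so the names below are hypothetical):

      // Hypothetical reconstruction -- the real TrainModel.java is not shown.
      import org.apache.spark.SparkConf;
      import org.apache.spark.api.java.JavaSparkContext;

      public class TrainModel {
          // BROKEN: a context built in a static initializer (<clinit>) is
          // constructed again on every executor JVM that loads this class,
          // where no master URL is configured.
          // private static final JavaSparkContext sc =
          //     new JavaSparkContext(new SparkConf().setAppName("TrainModel"));

          public static void main(String[] args) {
              // FIX: create the context in main so it exists only on the driver.
              // Leave the master unset here; spark-submit supplies it
              // (--master yarn in cluster mode, local[*] for local runs).
              SparkConf conf = new SparkConf().setAppName("TrainModel");
              JavaSparkContext sc = new JavaSparkContext(conf);
              try {
                  // ... parse -model/-ip/-k/-ite, build RDDs and train the
                  // model here, without static state that creates a context ...
              } finally {
                  sc.stop();
              }
          }
      }

    With the context created in main, the executors load TrainModel only for its serialized functions and never trigger a second SparkContext construction.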