Tags: java, guava, nosuchmethoderror, maven-shade-plugin, spark-submit

Apache Spark -- using spark-submit throws a NoSuchMethodError


To submit a Spark application to a cluster, the Spark documentation notes:

To do this, create an assembly jar (or “uber” jar) containing your code and its dependencies. Both sbt and Maven have assembly plugins. When creating assembly jars, list Spark and Hadoop as provided dependencies; these need not be bundled since they are provided by the cluster manager at runtime. -- http://spark.apache.org/docs/latest/submitting-applications.html

So I added the Apache Maven Shade Plugin (version 3.0.0) to my pom.xml file.
I also changed the scope of my Spark dependency (version 2.1.0) to provided.

(I also added the Apache Maven Assembly Plugin to ensure I was wrapping all of my dependencies in the jar when I run mvn clean package. I'm unsure if it's truly necessary.)


This is how spark-submit fails. It throws a NoSuchMethodError for one of my dependencies (note that the code works when run locally from inside IntelliJ, assuming that the provided scope is removed).

Exception in thread "main" java.lang.NoSuchMethodError: com.google.common.base.Stopwatch.createStarted()Lcom/google/common/base/Stopwatch;

The line of code that throws the error is irrelevant--it's simply the first line in my main method that creates a Stopwatch, part of the Google Guava utilities. (version 21.0)

Other solutions online suggest that it has to do with version conflicts of Guava, but I haven't had any luck yet with those suggestions. Any help would be appreciated, thank you.


Solution

  • If you take a look at the jars subdirectory of the Spark 2.1.0 installation, you will likely see guava-14.0.1.jar. Per the API documentation for the Guava Stopwatch#createStarted method you are using, createStarted did not exist until Guava 15.0. What is most likely happening is that the Spark process class loader finds the Spark-provided Guava 14.0.1 library before it finds the Guava 21.0 library packaged in your uberjar. The Stopwatch class is therefore loaded from the older jar, which has no createStarted method, hence the NoSuchMethodError.

    One possible resolution is to use the class-relocation feature of the Maven Shade plugin (which you're already using to build your uberjar). With class relocation, Shade moves the Guava 21.0 classes your code needs, while packaging the uberjar, from the package matched by pattern (e.g. com.google.common.base) to a package of your choosing given by shadedPattern (e.g. myguava123.com.google.common.base). It also rewrites the references in your own compiled classes so that they point at the relocated copies.

    The result is that the older and newer Guava libraries no longer share a package name, avoiding the runtime conflict.
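
The relocation described above might be configured roughly as follows. This is a minimal sketch, not the asker's actual pom.xml: it assumes the Shade plugin is bound to the package phase, that Guava 21.0 is declared as a regular (not provided) dependency so that it ends up in the uberjar, and it uses the illustrative myguava123 prefix from the answer, which can be any package name that does not clash with anything else on the classpath.

```xml
<!-- Sketch only: goes inside <build><plugins> of pom.xml. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>3.0.0</version>
  <executions>
    <execution>
      <!-- Build the shaded uberjar as part of "mvn clean package". -->
      <phase>package</phase>
      <goals>
        <goal>shade</goal>
      </goals>
      <configuration>
        <relocations>
          <relocation>
            <!-- Original Guava package prefix; covers com.google.common.base and its siblings. -->
            <pattern>com.google.common</pattern>
            <!-- Arbitrary new prefix (an assumption here); references in your classes are rewritten to it. -->
            <shadedPattern>myguava123.com.google.common</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

After running mvn clean package, listing the contents of the resulting jar (for example with jar tf) should show the Guava classes under myguava123/com/google/common/ rather than com/google/common/, which is a quick way to confirm that the relocation took effect before calling spark-submit again.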