Search code examples
hadoopapache-sparkmapreduceguava

IllegalAccessError to guava's StopWatch from org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus


I'm trying to run small spark application and am getting the following exception:

Exception in thread "main" java.lang.IllegalAccessError: tried to access method com.google.common.base.Stopwatch.<init>()V from class org.apache.hadoop.mapreduce.lib.input.FileInputFormat
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:262)
    at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:217)
    at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:217)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:32)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:219)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:217)
    at scala.Option.getOrElse(Option.scala:120)

the relevant gradle dependencies section:

compile('org.apache.spark:spark-core_2.10:1.3.1')
compile('org.apache.hadoop:hadoop-mapreduce-client-core:2.6.2') {force = true}
compile('org.apache.hadoop:hadoop-mapreduce-client-app:2.6.2') {force = true}
compile('org.apache.hadoop:hadoop-mapreduce-client-shuffle:2.6.2') {force = true}
compile('com.google.guava:guava:19.0') { force = true }

Solution

  • version 2.6.2 of hadoop:hadoop-mapreduce-client-core can't be used together with guava's new versions (I tried 17.0 - 19.0) since guava's StopWatch constructor can't be accessed (causing above IllegalAccessError)

    using hadoop-mapreduce-client-core's latest version - 2.7.2 (in which they don't use guava's StopWatch in the above method, rather they use org.apache.hadoop.util.StopWatch) solved the problem, with two additional dependencies that were required:

    compile('org.apache.hadoop:hadoop-mapreduce-client-core:2.7.2') {force = true}
    
    compile('org.apache.hadoop:hadoop-common:2.7.2') {force = true} // required for org.apache.hadoop.util.StopWatch  
    
    compile('commons-io:commons-io:2.4') {force = true} // required for org.apache.commons.io.Charsets that is used internally
    

    note: there are two org.apache.commons.io packages: commons-io:commons-io (ours here), and org.apache.commons:commons-io (old one, 2007). make sure to include the correct one.