Tags: scala, hadoop, amazon-emr, scalding

Scalding on EMR: Hadoop job fails with NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;


I need to run a Scalding job on EMR. The same job runs fine on a local Hadoop installation on my MacBook, but fails on Hadoop on EMR.

I have also asked for help in the cascading-user and scala-user groups, without success so far. After trying various changes over the past couple of days, I haven't made much progress.

Here is the error before I delve into the details:

Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
    at com.aggregation.job.DataAggregation$.<init>(DataAggregation.scala:30)
    at com.aggregation.job.DataAggregation$.<clinit>(DataAggregation.scala)
    at com.aggregation.job.DataAggregation.main(DataAggregation.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

People on those groups suggested it might be a conflict between Scala binary versions, but I couldn't see anything obvious. It would be great if someone could help me figure it out.
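For readers hitting the same error: `scala.Predef$.ArrowAssoc(Object)Object` is the method that code compiled with Scala 2.11 calls for `a -> b`, so a NoSuchMethodError on it usually indicates that an older scala-library (typically 2.10) is being picked up at runtime. A minimal sketch for checking a cluster node follows. It assumes the jars carry their version in the file name, and the function name `list_scala_libraries` is made up for illustration.

```shell
# Print every scala-library jar found under the given directories,
# prefixed with the binary version (2.10, 2.11, ...) taken from its file name.
# Jars without a version in their name (e.g. scala-library.jar) are not matched.
list_scala_libraries() {
  find "$@" -type f -name 'scala-library-2.*.jar' 2>/dev/null |
    while IFS= read -r jar; do
      name=${jar##*/}                 # scala-library-2.10.5.jar
      version=${name#scala-library-}  # 2.10.5.jar
      major=${version%%.*}            # 2
      rest=${version#*.}              # 10.5.jar
      minor=${rest%%.*}               # 10
      printf '%s.%s %s\n' "$major" "$minor" "$jar"
    done | sort
}
```

On a node this could be run as `list_scala_libraries /usr/share /home/hadoop` (possibly with sudo). More than one distinct version in the first column is a hint that the wrong library may win on the classpath.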

Environment:

Amazon EMR AMI: 3.8.0 (which includes Scala 2.11.1, Hadoop 2.4.0, Java 1.7.0_76)

Application environment: Scalding 0.15.0, Scala 2.11.1, Java 1.7.0_80, Hadoop 2.4.0

I have tried various configuration changes, and even manually installed a newer Scala version on EMR, but so far the error has not gone away.

Please help! Thank you.


Setup

build.sbt:

scalaVersion := "2.11.1"

ivyScala := ivyScala.value map {
  _.copy(overrideScalaVersion = true)
}

dependencies.sbt:

import AssemblyKeys._

val hadoopVersion = "2.4.0"

val scaldingVersion = "0.15.0"

libraryDependencies ++= Seq(
  "com.twitter" %% "scalding-core" % scaldingVersion,
  "com.twitter" %% "scalding-json" % scaldingVersion,
  "com.twitter" %% "scalding-jdbc" % scaldingVersion,
  "com.github.nscala-time" %% "nscala-time" % "2.0.0",
  "org.apache.hadoop" % "hadoop-common" % hadoopVersion % "provided",
  "org.apache.hadoop" % "hadoop-mapreduce-client-core" % hadoopVersion % "provided"
)

excludedJars in assembly <<= (fullClasspath in assembly) map { cp =>
  val excludes = Set(
    "jsp-api-2.1-6.1.14.jar",
    "jsp-2.1-6.1.14.jar",
    "jasper-compiler-5.5.12.jar",
    "minlog-1.2.jar", // Otherwise causes conflicts with Kryo (which bundles it)
    "janino-2.5.16.jar", // Janino includes a broken signature, and is not needed anyway
    "commons-beanutils-core-1.8.0.jar", // Clash with each other and with commons-collections
    "commons-beanutils-1.7.0.jar", // "
    "hadoop-core-1.2.1.jar", // Provided by Amazon EMR. Delete this line if you're not on EMR
    "hadoop-tools-1.2.1.jar" // "
  )
  cp filter { jar => excludes(jar.data.getName) }
}

resolvers ++= Seq(
  "Conjars repo" at "http://conjars.org/repo"
)

assembly.sbt:

import AssemblyKeys._

assemblySettings

mergeStrategy in assembly := Merge.mergeStrategy

project/build.properties:

sbt.version=0.13.1

project/assembly.sbt:

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.0")

Finally, this is the dependency tree from the sbt-dependency-graph plugin, which I used to check that I've got the right versions. Sorry, it's quite long; if there is a better way to present this information, please let me know.

sbt dependency tree:

[info] com.abc.aggregator_2.11:0.1-20150628T184441 [S]
[info]   +-com.github.nscala-time:nscala-time_2.11:2.0.0 [S]
[info]   | +-joda-time:joda-time:2.7
[info]   | +-org.joda:joda-convert:1.2
[info]   | 
[info]   +-com.twitter:scalding-core_2.11:0.15.0 [S]
[info]   | +-cascading:cascading-core:2.6.1
[info]   | | +-org.codehaus.janino:janino:2.7.5
[info]   | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info]   | | | 
[info]   | | +-riffle:riffle:0.1-dev
[info]   | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]   | | 
[info]   | +-cascading:cascading-hadoop:2.6.1
[info]   | | +-cascading:cascading-core:2.6.1
[info]   | |   +-org.codehaus.janino:janino:2.7.5
[info]   | |   | +-org.codehaus.janino:commons-compiler:2.7.5
[info]   | |   | 
[info]   | |   +-riffle:riffle:0.1-dev
[info]   | |   +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]   | |   
[info]   | +-cascading:cascading-local:2.6.1
[info]   | | +-cascading:cascading-core:2.6.1
[info]   | | | +-org.codehaus.janino:janino:2.7.5
[info]   | | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info]   | | | | 
[info]   | | | +-riffle:riffle:0.1-dev
[info]   | | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]   | | | 
[info]   | | +-com.google.guava:guava:14.0.1 (evicted by: 15.0)
[info]   | | +-com.google.guava:guava:15.0
[info]   | | +-org.slf4j:slf4j-api:1.6.6
[info]   | | +-org.slf4j:slf4j-api:1.7.2 (evicted by: 1.6.6)
[info]   | | 
[info]   | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info]   | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info]   | | 
[info]   | +-com.twitter:bijection-core_2.11:0.8.0 [S]
[info]   | +-com.twitter:chill-algebird_2.11:0.6.0 [S]
[info]   | | +-com.esotericsoftware.kryo:kryo:2.21
[info]   | | | +-com.esotericsoftware.minlog:minlog:1.2
[info]   | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   | | | | +-org.ow2.asm:asm:4.0
[info]   | | | | 
[info]   | | | +-org.objenesis:objenesis:1.2
[info]   | | | 
[info]   | | +-com.twitter:algebird-core_2.11:0.10.0 (evicted by: 0.10.1)
[info]   | | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info]   | | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info]   | | | 
[info]   | | +-com.twitter:chill_2.11:0.6.0 [S]
[info]   | |   +-com.esotericsoftware.kryo:kryo:2.21
[info]   | |   | +-com.esotericsoftware.minlog:minlog:1.2
[info]   | |   | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   | |   | | +-org.ow2.asm:asm:4.0
[info]   | |   | | 
[info]   | |   | +-org.objenesis:objenesis:1.2
[info]   | |   | 
[info]   | |   +-com.twitter:chill-java:0.6.0
[info]   | |     +-com.esotericsoftware.kryo:kryo:2.21
[info]   | |       +-com.esotericsoftware.minlog:minlog:1.2
[info]   | |       +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   | |       | +-org.ow2.asm:asm:4.0
[info]   | |       | 
[info]   | |       +-org.objenesis:objenesis:1.2
[info]   | |       
[info]   | +-com.twitter:chill-hadoop:0.6.0
[info]   | | +-com.esotericsoftware.kryo:kryo:2.21
[info]   | | | +-com.esotericsoftware.minlog:minlog:1.2
[info]   | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   | | | | +-org.ow2.asm:asm:4.0
[info]   | | | | 
[info]   | | | +-org.objenesis:objenesis:1.2
[info]   | | | 
[info]   | | +-com.twitter:chill-java:0.6.0
[info]   | | | +-com.esotericsoftware.kryo:kryo:2.21
[info]   | | |   +-com.esotericsoftware.minlog:minlog:1.2
[info]   | | |   +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   | | |   | +-org.ow2.asm:asm:4.0
[info]   | | |   | 
[info]   | | |   +-org.objenesis:objenesis:1.2
[info]   | | |   
[info]   | | +-org.slf4j:slf4j-api:1.6.6
[info]   | | 
[info]   | +-com.twitter:chill-java:0.6.0
[info]   | | +-com.esotericsoftware.kryo:kryo:2.21
[info]   | |   +-com.esotericsoftware.minlog:minlog:1.2
[info]   | |   +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   | |   | +-org.ow2.asm:asm:4.0
[info]   | |   | 
[info]   | |   +-org.objenesis:objenesis:1.2
[info]   | |   
[info]   | +-com.twitter:chill_2.11:0.6.0 [S]
[info]   | | +-com.esotericsoftware.kryo:kryo:2.21
[info]   | | | +-com.esotericsoftware.minlog:minlog:1.2
[info]   | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   | | | | +-org.ow2.asm:asm:4.0
[info]   | | | | 
[info]   | | | +-org.objenesis:objenesis:1.2
[info]   | | | 
[info]   | | +-com.twitter:chill-java:0.6.0
[info]   | |   +-com.esotericsoftware.kryo:kryo:2.21
[info]   | |     +-com.esotericsoftware.minlog:minlog:1.2
[info]   | |     +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   | |     | +-org.ow2.asm:asm:4.0
[info]   | |     | 
[info]   | |     +-org.objenesis:objenesis:1.2
[info]   | |     
[info]   | +-com.twitter:maple:0.15.0
[info]   | | +-cascading:cascading-hadoop:2.6.1
[info]   | |   +-cascading:cascading-core:2.6.1
[info]   | |     +-org.codehaus.janino:janino:2.7.5
[info]   | |     | +-org.codehaus.janino:commons-compiler:2.7.5
[info]   | |     | 
[info]   | |     +-riffle:riffle:0.1-dev
[info]   | |     +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]   | |     
[info]   | +-com.twitter:scalding-args_2.11:0.15.0 [S]
[info]   | +-com.twitter:scalding-date_2.11:0.15.0 [S]
[info]   | +-com.twitter:scalding-serialization_2.11:0.15.0 [S]
[info]   | +-org.slf4j:slf4j-api:1.6.6
[info]   | 
[info]   +-com.twitter:scalding-jdbc_2.11:0.15.0 [S]
[info]   | +-cascading:cascading-jdbc-core:2.6.0
[info]   | | +-com.google.guava:guava:15.0
[info]   | | 
[info]   | +-cascading:cascading-jdbc-mysql:2.6.0
[info]   | | +-cascading:cascading-jdbc-core:2.6.0
[info]   | | | +-com.google.guava:guava:15.0
[info]   | | | 
[info]   | | +-com.google.guava:guava:15.0
[info]   | | +-mysql:mysql-connector-java:5.1.25
[info]   | | 
[info]   | +-com.twitter:scalding-core_2.11:0.15.0 [S]
[info]   |   +-cascading:cascading-core:2.6.1
[info]   |   | +-org.codehaus.janino:janino:2.7.5
[info]   |   | | +-org.codehaus.janino:commons-compiler:2.7.5
[info]   |   | | 
[info]   |   | +-riffle:riffle:0.1-dev
[info]   |   | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]   |   | 
[info]   |   +-cascading:cascading-hadoop:2.6.1
[info]   |   | +-cascading:cascading-core:2.6.1
[info]   |   |   +-org.codehaus.janino:janino:2.7.5
[info]   |   |   | +-org.codehaus.janino:commons-compiler:2.7.5
[info]   |   |   | 
[info]   |   |   +-riffle:riffle:0.1-dev
[info]   |   |   +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]   |   |   
[info]   |   +-cascading:cascading-local:2.6.1
[info]   |   | +-cascading:cascading-core:2.6.1
[info]   |   | | +-org.codehaus.janino:janino:2.7.5
[info]   |   | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info]   |   | | | 
[info]   |   | | +-riffle:riffle:0.1-dev
[info]   |   | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]   |   | | 
[info]   |   | +-com.google.guava:guava:14.0.1 (evicted by: 15.0)
[info]   |   | +-com.google.guava:guava:15.0
[info]   |   | +-org.slf4j:slf4j-api:1.6.6
[info]   |   | +-org.slf4j:slf4j-api:1.7.2 (evicted by: 1.6.6)
[info]   |   | 
[info]   |   +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info]   |   | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info]   |   | 
[info]   |   +-com.twitter:bijection-core_2.11:0.8.0 [S]
[info]   |   +-com.twitter:chill-algebird_2.11:0.6.0 [S]
[info]   |   | +-com.esotericsoftware.kryo:kryo:2.21
[info]   |   | | +-com.esotericsoftware.minlog:minlog:1.2
[info]   |   | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   |   | | | +-org.ow2.asm:asm:4.0
[info]   |   | | | 
[info]   |   | | +-org.objenesis:objenesis:1.2
[info]   |   | | 
[info]   |   | +-com.twitter:algebird-core_2.11:0.10.0 (evicted by: 0.10.1)
[info]   |   | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info]   |   | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info]   |   | | 
[info]   |   | +-com.twitter:chill_2.11:0.6.0 [S]
[info]   |   |   +-com.esotericsoftware.kryo:kryo:2.21
[info]   |   |   | +-com.esotericsoftware.minlog:minlog:1.2
[info]   |   |   | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   |   |   | | +-org.ow2.asm:asm:4.0
[info]   |   |   | | 
[info]   |   |   | +-org.objenesis:objenesis:1.2
[info]   |   |   | 
[info]   |   |   +-com.twitter:chill-java:0.6.0
[info]   |   |     +-com.esotericsoftware.kryo:kryo:2.21
[info]   |   |       +-com.esotericsoftware.minlog:minlog:1.2
[info]   |   |       +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   |   |       | +-org.ow2.asm:asm:4.0
[info]   |   |       | 
[info]   |   |       +-org.objenesis:objenesis:1.2
[info]   |   |       
[info]   |   +-com.twitter:chill-hadoop:0.6.0
[info]   |   | +-com.esotericsoftware.kryo:kryo:2.21
[info]   |   | | +-com.esotericsoftware.minlog:minlog:1.2
[info]   |   | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   |   | | | +-org.ow2.asm:asm:4.0
[info]   |   | | | 
[info]   |   | | +-org.objenesis:objenesis:1.2
[info]   |   | | 
[info]   |   | +-com.twitter:chill-java:0.6.0
[info]   |   | | +-com.esotericsoftware.kryo:kryo:2.21
[info]   |   | |   +-com.esotericsoftware.minlog:minlog:1.2
[info]   |   | |   +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   |   | |   | +-org.ow2.asm:asm:4.0
[info]   |   | |   | 
[info]   |   | |   +-org.objenesis:objenesis:1.2
[info]   |   | |   
[info]   |   | +-org.slf4j:slf4j-api:1.6.6
[info]   |   | 
[info]   |   +-com.twitter:chill-java:0.6.0
[info]   |   | +-com.esotericsoftware.kryo:kryo:2.21
[info]   |   |   +-com.esotericsoftware.minlog:minlog:1.2
[info]   |   |   +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   |   |   | +-org.ow2.asm:asm:4.0
[info]   |   |   | 
[info]   |   |   +-org.objenesis:objenesis:1.2
[info]   |   |   
[info]   |   +-com.twitter:chill_2.11:0.6.0 [S]
[info]   |   | +-com.esotericsoftware.kryo:kryo:2.21
[info]   |   | | +-com.esotericsoftware.minlog:minlog:1.2
[info]   |   | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   |   | | | +-org.ow2.asm:asm:4.0
[info]   |   | | | 
[info]   |   | | +-org.objenesis:objenesis:1.2
[info]   |   | | 
[info]   |   | +-com.twitter:chill-java:0.6.0
[info]   |   |   +-com.esotericsoftware.kryo:kryo:2.21
[info]   |   |     +-com.esotericsoftware.minlog:minlog:1.2
[info]   |   |     +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]   |   |     | +-org.ow2.asm:asm:4.0
[info]   |   |     | 
[info]   |   |     +-org.objenesis:objenesis:1.2
[info]   |   |     
[info]   |   +-com.twitter:maple:0.15.0
[info]   |   | +-cascading:cascading-hadoop:2.6.1
[info]   |   |   +-cascading:cascading-core:2.6.1
[info]   |   |     +-org.codehaus.janino:janino:2.7.5
[info]   |   |     | +-org.codehaus.janino:commons-compiler:2.7.5
[info]   |   |     | 
[info]   |   |     +-riffle:riffle:0.1-dev
[info]   |   |     +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]   |   |     
[info]   |   +-com.twitter:scalding-args_2.11:0.15.0 [S]
[info]   |   +-com.twitter:scalding-date_2.11:0.15.0 [S]
[info]   |   +-com.twitter:scalding-serialization_2.11:0.15.0 [S]
[info]   |   +-org.slf4j:slf4j-api:1.6.6
[info]   |   
[info]   +-com.twitter:scalding-json_2.11:0.15.0 [S]
[info]     +-com.fasterxml.jackson.module:jackson-module-scala_2.11:2.4.2 [S]
[info]     | +-com.fasterxml.jackson.core:jackson-annotations:2.4.2
[info]     | +-com.fasterxml.jackson.core:jackson-core:2.4.2
[info]     | +-com.fasterxml.jackson.core:jackson-databind:2.4.2
[info]     | | +-com.fasterxml.jackson.core:jackson-annotations:2.4.0 (evicted by: 2.4.2)
[info]     | | +-com.fasterxml.jackson.core:jackson-annotations:2.4.2
[info]     | | +-com.fasterxml.jackson.core:jackson-core:2.4.2
[info]     | | 
[info]     | +-com.google.code.findbugs:jsr305:2.0.1
[info]     | +-com.google.guava:guava:15.0
[info]     | +-com.thoughtworks.paranamer:paranamer:2.6
[info]     | +-org.scala-lang:scala-reflect:2.11.2 [S]
[info]     | 
[info]     +-com.twitter:scalding-core_2.11:0.15.0 [S]
[info]     | +-cascading:cascading-core:2.6.1
[info]     | | +-org.codehaus.janino:janino:2.7.5
[info]     | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info]     | | | 
[info]     | | +-riffle:riffle:0.1-dev
[info]     | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]     | | 
[info]     | +-cascading:cascading-hadoop:2.6.1
[info]     | | +-cascading:cascading-core:2.6.1
[info]     | |   +-org.codehaus.janino:janino:2.7.5
[info]     | |   | +-org.codehaus.janino:commons-compiler:2.7.5
[info]     | |   | 
[info]     | |   +-riffle:riffle:0.1-dev
[info]     | |   +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]     | |   
[info]     | +-cascading:cascading-local:2.6.1
[info]     | | +-cascading:cascading-core:2.6.1
[info]     | | | +-org.codehaus.janino:janino:2.7.5
[info]     | | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info]     | | | | 
[info]     | | | +-riffle:riffle:0.1-dev
[info]     | | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]     | | | 
[info]     | | +-com.google.guava:guava:14.0.1 (evicted by: 15.0)
[info]     | | +-com.google.guava:guava:15.0
[info]     | | +-org.slf4j:slf4j-api:1.6.6
[info]     | | +-org.slf4j:slf4j-api:1.7.2 (evicted by: 1.6.6)
[info]     | | 
[info]     | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info]     | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info]     | | 
[info]     | +-com.twitter:bijection-core_2.11:0.8.0 [S]
[info]     | +-com.twitter:chill-algebird_2.11:0.6.0 [S]
[info]     | | +-com.esotericsoftware.kryo:kryo:2.21
[info]     | | | +-com.esotericsoftware.minlog:minlog:1.2
[info]     | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]     | | | | +-org.ow2.asm:asm:4.0
[info]     | | | | 
[info]     | | | +-org.objenesis:objenesis:1.2
[info]     | | | 
[info]     | | +-com.twitter:algebird-core_2.11:0.10.0 (evicted by: 0.10.1)
[info]     | | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info]     | | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info]     | | | 
[info]     | | +-com.twitter:chill_2.11:0.6.0 [S]
[info]     | |   +-com.esotericsoftware.kryo:kryo:2.21
[info]     | |   | +-com.esotericsoftware.minlog:minlog:1.2
[info]     | |   | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]     | |   | | +-org.ow2.asm:asm:4.0
[info]     | |   | | 
[info]     | |   | +-org.objenesis:objenesis:1.2
[info]     | |   | 
[info]     | |   +-com.twitter:chill-java:0.6.0
[info]     | |     +-com.esotericsoftware.kryo:kryo:2.21
[info]     | |       +-com.esotericsoftware.minlog:minlog:1.2
[info]     | |       +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]     | |       | +-org.ow2.asm:asm:4.0
[info]     | |       | 
[info]     | |       +-org.objenesis:objenesis:1.2
[info]     | |       
[info]     | +-com.twitter:chill-hadoop:0.6.0
[info]     | | +-com.esotericsoftware.kryo:kryo:2.21
[info]     | | | +-com.esotericsoftware.minlog:minlog:1.2
[info]     | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]     | | | | +-org.ow2.asm:asm:4.0
[info]     | | | | 
[info]     | | | +-org.objenesis:objenesis:1.2
[info]     | | | 
[info]     | | +-com.twitter:chill-java:0.6.0
[info]     | | | +-com.esotericsoftware.kryo:kryo:2.21
[info]     | | |   +-com.esotericsoftware.minlog:minlog:1.2
[info]     | | |   +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]     | | |   | +-org.ow2.asm:asm:4.0
[info]     | | |   | 
[info]     | | |   +-org.objenesis:objenesis:1.2
[info]     | | |   
[info]     | | +-org.slf4j:slf4j-api:1.6.6
[info]     | | 
[info]     | +-com.twitter:chill-java:0.6.0
[info]     | | +-com.esotericsoftware.kryo:kryo:2.21
[info]     | |   +-com.esotericsoftware.minlog:minlog:1.2
[info]     | |   +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]     | |   | +-org.ow2.asm:asm:4.0
[info]     | |   | 
[info]     | |   +-org.objenesis:objenesis:1.2
[info]     | |   
[info]     | +-com.twitter:chill_2.11:0.6.0 [S]
[info]     | | +-com.esotericsoftware.kryo:kryo:2.21
[info]     | | | +-com.esotericsoftware.minlog:minlog:1.2
[info]     | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]     | | | | +-org.ow2.asm:asm:4.0
[info]     | | | | 
[info]     | | | +-org.objenesis:objenesis:1.2
[info]     | | | 
[info]     | | +-com.twitter:chill-java:0.6.0
[info]     | |   +-com.esotericsoftware.kryo:kryo:2.21
[info]     | |     +-com.esotericsoftware.minlog:minlog:1.2
[info]     | |     +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info]     | |     | +-org.ow2.asm:asm:4.0
[info]     | |     | 
[info]     | |     +-org.objenesis:objenesis:1.2
[info]     | |     
[info]     | +-com.twitter:maple:0.15.0
[info]     | | +-cascading:cascading-hadoop:2.6.1
[info]     | |   +-cascading:cascading-core:2.6.1
[info]     | |     +-org.codehaus.janino:janino:2.7.5
[info]     | |     | +-org.codehaus.janino:commons-compiler:2.7.5
[info]     | |     | 
[info]     | |     +-riffle:riffle:0.1-dev
[info]     | |     +-thirdparty:jgrapht-jdk1.6:0.8.1
[info]     | |     
[info]     | +-com.twitter:scalding-args_2.11:0.15.0 [S]
[info]     | +-com.twitter:scalding-date_2.11:0.15.0 [S]
[info]     | +-com.twitter:scalding-serialization_2.11:0.15.0 [S]
[info]     | +-org.slf4j:slf4j-api:1.6.6
[info]     | 
[info]     +-org.json4s:json4s-native_2.11:3.2.11 [S]
[info]       +-org.json4s:json4s-core_2.11:3.2.11 [S]
[info]         +-com.thoughtworks.paranamer:paranamer:2.6
[info]         +-org.json4s:json4s-ast_2.11:3.2.11 [S]
[info]         +-org.scala-lang:scalap:2.11.0
[info]           +-org.scala-lang:scala-compiler:2.11.1 [S]
[info]             +-org.scala-lang.modules:scala-parser-combinators_2.11:1.0.1 [S]
[info]             +-org.scala-lang.modules:scala-xml_2.11:1.0.2 [S]
[info]             +-org.scala-lang:scala-reflect:2.11.1 (evicted by: 2.11.2)
[info]             +-org.scala-lang:scala-reflect:2.11.2 [S]
[info]             
[success] Total time: 12 s, completed Jun 28, 2015 11:44:54 AM
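
A tree this long is hard to scan by eye for a stray Scala binary version. A small sketch that pulls the distinct `_2.x` artifact suffixes out of such output might help. The function name `scala_suffixes` and the file `tree.txt` are hypothetical, and it assumes the output format shown above.

```shell
# Read sbt-dependency-graph output on stdin and print the distinct
# Scala binary-version suffixes (the "2.11" in "chill_2.11:0.6.0").
scala_suffixes() {
  grep -oE '_2\.[0-9]+:' | tr -d '_:' | sort -u
}
```

With the tree saved to a file, `scala_suffixes < tree.txt` should list a single version when the build is consistent. A single version here would suggest that the conflict comes from the runtime environment rather than from the build.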

Further info as requested:

I build the fat jar using 'sbt assembly', and currently I'm using the AWS console with a "Custom JAR" step to test this out before automating the process.

JAR location: s3://path/to/jar/data-aggregator-0.1.jar
Arguments: com.abc.aggregation.job.DataAggregation --hdfs --input s3n://path/to/input/data/file.json --output s3n://path/to/input/data/file.txt

UPDATE:

I was able to get past the above error by setting HADOOP_CLASSPATH to point to the Scala 2.11.1 jars, while excluding those jars from the sbt assembly step. I passed this in using hadoop-user-env.sh, and it seemed to work on the master node. However, once the job reached the mapper step it failed again with another Scala error. Now I am stuck on this step.
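
When switching between bundling and excluding the Scala library, it can help to confirm what actually ended up in the fat jar. A minimal sketch follows, assuming the JDK's `jar` tool is available. The function name `bundles_scala_library` is made up for illustration.

```shell
# Read a jar's entry listing (one path per line, e.g. from `jar tf`) on stdin
# and succeed if it contains the Scala standard library's Predef object class.
bundles_scala_library() {
  grep -qxF 'scala/Predef$.class'
}
```

For example: `jar tf data-aggregator-0.1.jar | bundles_scala_library && echo "Scala library is bundled"`. Reading the listing from stdin keeps the check independent of which tool produced it.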

Assuming this is because the mappers and reducers aren't seeing the HADOOP_CLASSPATH update, I tried adding the -libjars argument pointing to the Scala jar files on the Hadoop master node itself. But this (below) doesn't seem to work.

JAR location: s3://path/to/jar/data-aggregator-0.1.jar
Arguments: com.abc.aggregation.job.DataAggregation -libjars /usr/share/scala/lib/scala-library.jar,/usr/share/scala/lib/scala-reflect.jar --hdfs --input s3n://path/to/input/data/file.json --output s3n://path/to/input/data/file.txt

Solution

  • Fixed. It turned out that there were multiple Scala jars on the EMR instances, and they weren't coming from my application jar.

    The 2.10 jar was hiding in /usr/share/aws/emr/emrfs/lib, separate from the installed location of the 2.11 binaries under /usr/share/scala. I removed the 2.10 jar on every instance in the cluster, and my job completed successfully. Next I will create a bootstrap action for this.

    $ sudo find / -name "scala-library-2.10.*.jar" -exec rm -rf {} \;
    

    FYI, these are the paths where each version was found:

    [ec2-user@ip-172-31-72-130 ~]$ sudo find / -name "scala-library-2.11.*.jar"
    
    /home/hadoop/.versions/hbase-0.94.18/lib/scala-library-2.11.0.jar
    /usr/share/doc/scala/api/jars/scala-library-2.11.1-javadoc.jar
    
    [ec2-user@ip-172-31-72-130 ~]$ sudo find / -name "scala-library-2.10.*.jar"
    
    /usr/share/aws/emr/emrfs/lib/scala-library-2.10.5.jar
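
The cleanup described above could be turned into a bootstrap action along the following lines. This is only a sketch under stated assumptions: it moves the 2.10 jars aside instead of deleting them (a deliberate change from the `rm` above, so the step can be undone if EMRFS turns out to need the jar), it assumes the jar is already on disk when the bootstrap action runs, and the function name `quarantine_scala_210` and the backup directory are hypothetical.

```shell
# Move every scala-library 2.10 jar found under ROOT into BACKUP_DIR
# instead of deleting it, and print the original path of each moved jar.
# BACKUP_DIR should live outside ROOT so moved jars are not found again.
quarantine_scala_210() {
  root=$1
  backup_dir=$2
  mkdir -p "$backup_dir" || return 1
  find "$root" -type f -name 'scala-library-2.10.*.jar' 2>/dev/null |
    while IFS= read -r jar; do
      mv -- "$jar" "$backup_dir/" && printf '%s\n' "$jar"
    done
}
```

On a node it might be invoked, with sufficient privileges, as `quarantine_scala_210 /usr/share/aws/emr/emrfs/lib /tmp/scala-2.10-quarantine`. Jars with identical file names from different directories would overwrite each other in the backup directory, which is acceptable for a single known path like the one above.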