I need to run a Scalding job on EMR. The same job runs perfectly fine on a local Hadoop installation on my MacBook, but fails on Hadoop on EMR.
I have also asked about this issue in the cascading-user and scala-user groups, without success. After trying various changes over the past couple of days, I have not made much progress.
Here is the error before I delve into the details:
Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.ArrowAssoc(Ljava/lang/Object;)Ljava/lang/Object;
    at com.aggregation.job.DataAggregation$.<init>(DataAggregation.scala:30)
    at com.aggregation.job.DataAggregation$.<clinit>(DataAggregation.scala)
    at com.aggregation.job.DataAggregation.main(DataAggregation.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
People in those groups suggested it might be a Scala version conflict between binaries, but I couldn't see anything obvious. It would be great if someone could help me figure it out.
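One way to check for such a conflict: as far as I know, Scala 2.11 exposes this implicit as Predef.ArrowAssoc, while in Scala 2.10 the corresponding method is named any2ArrowAssoc, so this NoSuchMethodError is what you would expect if code compiled for 2.11 ends up running against a 2.10 scala-library at runtime. A minimal sketch for spotting mixed versions on a node follows; the helper name find_scala_libs is hypothetical and the search root is an assumption.

```shell
# Hypothetical helper (not part of any Hadoop or EMR tooling): list every
# scala-library jar below a directory so that mixed Scala versions stand out.
find_scala_libs() {
  # $1: directory to search, e.g. / on a cluster node (needs sudo there)
  find "$1" -type f -name 'scala-library*.jar' 2>/dev/null | sort
}

# Example call, assuming the jars live somewhere under /usr/share:
#   find_scala_libs /usr/share
```

If the output contains both a 2.10.x and a 2.11.x jar, the directory holding the 2.10 jar is a good candidate for the classpath entry that shadows the 2.11 library.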
Environment:
Amazon EMR AMI: 3.8.0 (which includes Scala 2.11.1, Hadoop 2.4.0, Java 1.7.0_76)
Application environment: Scalding 0.15.0, Scala 2.11.1, Java 1.7.0_80, Hadoop 2.4.0
I have tried various configuration changes, and even manually installed a newer Scala version on EMR, but so far the error has not gone away.
Please help! Thank you.
Setup
build.sbt:
scalaVersion := "2.11.1"

ivyScala := ivyScala.value map {
  _.copy(overrideScalaVersion = true)
}
dependencies.sbt:
import AssemblyKeys._

val hadoopVersion = "2.4.0"

val scaldingVersion = "0.15.0"

libraryDependencies ++= Seq(
  "com.twitter" %% "scalding-core" % scaldingVersion,
  "com.twitter" %% "scalding-json" % scaldingVersion,
  "com.twitter" %% "scalding-jdbc" % scaldingVersion,
  "com.github.nscala-time" %% "nscala-time" % "2.0.0",
  "org.apache.hadoop" % "hadoop-common" % hadoopVersion % "provided",
  "org.apache.hadoop" % "hadoop-mapreduce-client-core" % hadoopVersion % "provided"
)

excludedJars in assembly <<= (fullClasspath in assembly) map { cp =>
  val excludes = Set(
    "jsp-api-2.1-6.1.14.jar",
    "jsp-2.1-6.1.14.jar",
    "jasper-compiler-5.5.12.jar",
    "minlog-1.2.jar", // Otherwise causes conflicts with Kryo (which bundles it)
    "janino-2.5.16.jar", // Janino includes a broken signature, and is not needed anyway
    "commons-beanutils-core-1.8.0.jar", // Clash with each other and with commons-collections
    "commons-beanutils-1.7.0.jar", // "
    "hadoop-core-1.2.1.jar", // Provided by Amazon EMR. Delete this line if you're not on EMR
    "hadoop-tools-1.2.1.jar" // "
  )
  cp filter { jar => excludes(jar.data.getName) }
}

resolvers ++= Seq(
  "Conjars repo" at "http://conjars.org/repo"
)
assembly.sbt:
import AssemblyKeys._

assemblySettings

mergeStrategy in assembly := Merge.mergeStrategy
project/build.properties:
sbt.version=0.13.1
project/assembly.sbt:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.11.0")
Finally, this is the dependency tree produced by the sbt-dependency-graph plugin, which I used to check that I've got the right versions. Sorry, it's quite long; if there is a better way to present this information, please let me know.
sbt dependency tree:
[info] com.abc.aggregator_2.11:0.1-20150628T184441 [S]
[info] +-com.github.nscala-time:nscala-time_2.11:2.0.0 [S]
[info] | +-joda-time:joda-time:2.7
[info] | +-org.joda:joda-convert:1.2
[info] |
[info] +-com.twitter:scalding-core_2.11:0.15.0 [S]
[info] | +-cascading:cascading-core:2.6.1
[info] | | +-org.codehaus.janino:janino:2.7.5
[info] | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | |
[info] | | +-riffle:riffle:0.1-dev
[info] | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | |
[info] | +-cascading:cascading-hadoop:2.6.1
[info] | | +-cascading:cascading-core:2.6.1
[info] | | +-org.codehaus.janino:janino:2.7.5
[info] | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | |
[info] | | +-riffle:riffle:0.1-dev
[info] | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | |
[info] | +-cascading:cascading-local:2.6.1
[info] | | +-cascading:cascading-core:2.6.1
[info] | | | +-org.codehaus.janino:janino:2.7.5
[info] | | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | | |
[info] | | | +-riffle:riffle:0.1-dev
[info] | | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | | |
[info] | | +-com.google.guava:guava:14.0.1 (evicted by: 15.0)
[info] | | +-com.google.guava:guava:15.0
[info] | | +-org.slf4j:slf4j-api:1.6.6
[info] | | +-org.slf4j:slf4j-api:1.7.2 (evicted by: 1.6.6)
[info] | |
[info] | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info] | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info] | |
[info] | +-com.twitter:bijection-core_2.11:0.8.0 [S]
[info] | +-com.twitter:chill-algebird_2.11:0.6.0 [S]
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:algebird-core_2.11:0.10.0 (evicted by: 0.10.1)
[info] | | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info] | | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info] | | |
[info] | | +-com.twitter:chill_2.11:0.6.0 [S]
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:chill-java:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | +-org.ow2.asm:asm:4.0
[info] | | |
[info] | | +-org.objenesis:objenesis:1.2
[info] | |
[info] | +-com.twitter:chill-hadoop:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:chill-java:0.6.0
[info] | | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-org.slf4j:slf4j-api:1.6.6
[info] | |
[info] | +-com.twitter:chill-java:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | +-org.ow2.asm:asm:4.0
[info] | | |
[info] | | +-org.objenesis:objenesis:1.2
[info] | |
[info] | +-com.twitter:chill_2.11:0.6.0 [S]
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:chill-java:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | +-org.ow2.asm:asm:4.0
[info] | | |
[info] | | +-org.objenesis:objenesis:1.2
[info] | |
[info] | +-com.twitter:maple:0.15.0
[info] | | +-cascading:cascading-hadoop:2.6.1
[info] | | +-cascading:cascading-core:2.6.1
[info] | | +-org.codehaus.janino:janino:2.7.5
[info] | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | |
[info] | | +-riffle:riffle:0.1-dev
[info] | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | |
[info] | +-com.twitter:scalding-args_2.11:0.15.0 [S]
[info] | +-com.twitter:scalding-date_2.11:0.15.0 [S]
[info] | +-com.twitter:scalding-serialization_2.11:0.15.0 [S]
[info] | +-org.slf4j:slf4j-api:1.6.6
[info] |
[info] +-com.twitter:scalding-jdbc_2.11:0.15.0 [S]
[info] | +-cascading:cascading-jdbc-core:2.6.0
[info] | | +-com.google.guava:guava:15.0
[info] | |
[info] | +-cascading:cascading-jdbc-mysql:2.6.0
[info] | | +-cascading:cascading-jdbc-core:2.6.0
[info] | | | +-com.google.guava:guava:15.0
[info] | | |
[info] | | +-com.google.guava:guava:15.0
[info] | | +-mysql:mysql-connector-java:5.1.25
[info] | |
[info] | +-com.twitter:scalding-core_2.11:0.15.0 [S]
[info] | +-cascading:cascading-core:2.6.1
[info] | | +-org.codehaus.janino:janino:2.7.5
[info] | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | |
[info] | | +-riffle:riffle:0.1-dev
[info] | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | |
[info] | +-cascading:cascading-hadoop:2.6.1
[info] | | +-cascading:cascading-core:2.6.1
[info] | | +-org.codehaus.janino:janino:2.7.5
[info] | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | |
[info] | | +-riffle:riffle:0.1-dev
[info] | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | |
[info] | +-cascading:cascading-local:2.6.1
[info] | | +-cascading:cascading-core:2.6.1
[info] | | | +-org.codehaus.janino:janino:2.7.5
[info] | | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | | |
[info] | | | +-riffle:riffle:0.1-dev
[info] | | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | | |
[info] | | +-com.google.guava:guava:14.0.1 (evicted by: 15.0)
[info] | | +-com.google.guava:guava:15.0
[info] | | +-org.slf4j:slf4j-api:1.6.6
[info] | | +-org.slf4j:slf4j-api:1.7.2 (evicted by: 1.6.6)
[info] | |
[info] | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info] | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info] | |
[info] | +-com.twitter:bijection-core_2.11:0.8.0 [S]
[info] | +-com.twitter:chill-algebird_2.11:0.6.0 [S]
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:algebird-core_2.11:0.10.0 (evicted by: 0.10.1)
[info] | | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info] | | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info] | | |
[info] | | +-com.twitter:chill_2.11:0.6.0 [S]
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:chill-java:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | +-org.ow2.asm:asm:4.0
[info] | | |
[info] | | +-org.objenesis:objenesis:1.2
[info] | |
[info] | +-com.twitter:chill-hadoop:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:chill-java:0.6.0
[info] | | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-org.slf4j:slf4j-api:1.6.6
[info] | |
[info] | +-com.twitter:chill-java:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | +-org.ow2.asm:asm:4.0
[info] | | |
[info] | | +-org.objenesis:objenesis:1.2
[info] | |
[info] | +-com.twitter:chill_2.11:0.6.0 [S]
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:chill-java:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | +-org.ow2.asm:asm:4.0
[info] | | |
[info] | | +-org.objenesis:objenesis:1.2
[info] | |
[info] | +-com.twitter:maple:0.15.0
[info] | | +-cascading:cascading-hadoop:2.6.1
[info] | | +-cascading:cascading-core:2.6.1
[info] | | +-org.codehaus.janino:janino:2.7.5
[info] | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | |
[info] | | +-riffle:riffle:0.1-dev
[info] | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | |
[info] | +-com.twitter:scalding-args_2.11:0.15.0 [S]
[info] | +-com.twitter:scalding-date_2.11:0.15.0 [S]
[info] | +-com.twitter:scalding-serialization_2.11:0.15.0 [S]
[info] | +-org.slf4j:slf4j-api:1.6.6
[info] |
[info] +-com.twitter:scalding-json_2.11:0.15.0 [S]
[info] +-com.fasterxml.jackson.module:jackson-module-scala_2.11:2.4.2 [S]
[info] | +-com.fasterxml.jackson.core:jackson-annotations:2.4.2
[info] | +-com.fasterxml.jackson.core:jackson-core:2.4.2
[info] | +-com.fasterxml.jackson.core:jackson-databind:2.4.2
[info] | | +-com.fasterxml.jackson.core:jackson-annotations:2.4.0 (evicted by: 2.4.2)
[info] | | +-com.fasterxml.jackson.core:jackson-annotations:2.4.2
[info] | | +-com.fasterxml.jackson.core:jackson-core:2.4.2
[info] | |
[info] | +-com.google.code.findbugs:jsr305:2.0.1
[info] | +-com.google.guava:guava:15.0
[info] | +-com.thoughtworks.paranamer:paranamer:2.6
[info] | +-org.scala-lang:scala-reflect:2.11.2 [S]
[info] |
[info] +-com.twitter:scalding-core_2.11:0.15.0 [S]
[info] | +-cascading:cascading-core:2.6.1
[info] | | +-org.codehaus.janino:janino:2.7.5
[info] | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | |
[info] | | +-riffle:riffle:0.1-dev
[info] | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | |
[info] | +-cascading:cascading-hadoop:2.6.1
[info] | | +-cascading:cascading-core:2.6.1
[info] | | +-org.codehaus.janino:janino:2.7.5
[info] | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | |
[info] | | +-riffle:riffle:0.1-dev
[info] | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | |
[info] | +-cascading:cascading-local:2.6.1
[info] | | +-cascading:cascading-core:2.6.1
[info] | | | +-org.codehaus.janino:janino:2.7.5
[info] | | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | | |
[info] | | | +-riffle:riffle:0.1-dev
[info] | | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | | |
[info] | | +-com.google.guava:guava:14.0.1 (evicted by: 15.0)
[info] | | +-com.google.guava:guava:15.0
[info] | | +-org.slf4j:slf4j-api:1.6.6
[info] | | +-org.slf4j:slf4j-api:1.7.2 (evicted by: 1.6.6)
[info] | |
[info] | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info] | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info] | |
[info] | +-com.twitter:bijection-core_2.11:0.8.0 [S]
[info] | +-com.twitter:chill-algebird_2.11:0.6.0 [S]
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:algebird-core_2.11:0.10.0 (evicted by: 0.10.1)
[info] | | +-com.twitter:algebird-core_2.11:0.10.1 [S]
[info] | | | +-com.googlecode.javaewah:JavaEWAH:0.6.6
[info] | | |
[info] | | +-com.twitter:chill_2.11:0.6.0 [S]
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:chill-java:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | +-org.ow2.asm:asm:4.0
[info] | | |
[info] | | +-org.objenesis:objenesis:1.2
[info] | |
[info] | +-com.twitter:chill-hadoop:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:chill-java:0.6.0
[info] | | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-org.slf4j:slf4j-api:1.6.6
[info] | |
[info] | +-com.twitter:chill-java:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | +-org.ow2.asm:asm:4.0
[info] | | |
[info] | | +-org.objenesis:objenesis:1.2
[info] | |
[info] | +-com.twitter:chill_2.11:0.6.0 [S]
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | | +-org.ow2.asm:asm:4.0
[info] | | | |
[info] | | | +-org.objenesis:objenesis:1.2
[info] | | |
[info] | | +-com.twitter:chill-java:0.6.0
[info] | | +-com.esotericsoftware.kryo:kryo:2.21
[info] | | +-com.esotericsoftware.minlog:minlog:1.2
[info] | | +-com.esotericsoftware.reflectasm:reflectasm:1.07
[info] | | | +-org.ow2.asm:asm:4.0
[info] | | |
[info] | | +-org.objenesis:objenesis:1.2
[info] | |
[info] | +-com.twitter:maple:0.15.0
[info] | | +-cascading:cascading-hadoop:2.6.1
[info] | | +-cascading:cascading-core:2.6.1
[info] | | +-org.codehaus.janino:janino:2.7.5
[info] | | | +-org.codehaus.janino:commons-compiler:2.7.5
[info] | | |
[info] | | +-riffle:riffle:0.1-dev
[info] | | +-thirdparty:jgrapht-jdk1.6:0.8.1
[info] | |
[info] | +-com.twitter:scalding-args_2.11:0.15.0 [S]
[info] | +-com.twitter:scalding-date_2.11:0.15.0 [S]
[info] | +-com.twitter:scalding-serialization_2.11:0.15.0 [S]
[info] | +-org.slf4j:slf4j-api:1.6.6
[info] |
[info] +-org.json4s:json4s-native_2.11:3.2.11 [S]
[info] +-org.json4s:json4s-core_2.11:3.2.11 [S]
[info] +-com.thoughtworks.paranamer:paranamer:2.6
[info] +-org.json4s:json4s-ast_2.11:3.2.11 [S]
[info] +-org.scala-lang:scalap:2.11.0
[info] +-org.scala-lang:scala-compiler:2.11.1 [S]
[info] +-org.scala-lang.modules:scala-parser-combinators_2.11:1.0.1 [S]
[info] +-org.scala-lang.modules:scala-xml_2.11:1.0.2 [S]
[info] +-org.scala-lang:scala-reflect:2.11.1 (evicted by: 2.11.2)
[info] +-org.scala-lang:scala-reflect:2.11.2 [S]
[info]
[success] Total time: 12 s, completed Jun 28, 2015 11:44:54 AM
Further info as requested:
I build the fat jar using 'sbt assembly', and currently I'm using the AWS console with a "Custom JAR" step to test this out before automating the process.
JAR location: s3://path/to/jar/data-aggregator-0.1.jar
Arguments: com.abc.aggregation.job.DataAggregation --hdfs --input s3n://path/to/input/data/file.json --output s3n://path/to/input/data/file.txt
UPDATE:
I was able to get past the above error by setting HADOOP_CLASSPATH to point to the Scala 2.11.1 jars, while excluding those same jars from the sbt assembly step. This was passed in using hadoop-user-env.sh and seemed to work for the master node. However, once the job reached the mapper step it failed again with another Scala error. Now I am stuck on this step.
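For illustration, a hadoop-user-env.sh fragment along these lines might look like the sketch below. The jar paths are an assumption based on the Scala 2.11 install location under /usr/share/scala, and the variable SCALA_LIB_DIR is introduced here only for readability.

```shell
# Sketch of a hadoop-user-env.sh fragment. The paths are an assumption
# (Scala 2.11 installed under /usr/share/scala); adjust them to your nodes.
SCALA_LIB_DIR="/usr/share/scala/lib"

# Prepend the Scala jars, keeping any existing HADOOP_CLASSPATH entries after them.
export HADOOP_CLASSPATH="${SCALA_LIB_DIR}/scala-library.jar:${SCALA_LIB_DIR}/scala-reflect.jar${HADOOP_CLASSPATH:+:${HADOOP_CLASSPATH}}"
```

As far as I understand, HADOOP_CLASSPATH only affects JVMs started by the hadoop command on the node where it is set, which would be consistent with the job getting further on the master while the tasks still fail.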
Assuming this is because the mappers and reducers don't see the updated HADOOP_CLASSPATH, I tried adding the -libjars argument, pointing to the Scala jar files on the Hadoop master itself. But this (below) doesn't seem to work.
JAR location: s3://path/to/jar/data-aggregator-0.1.jar
Arguments: com.abc.aggregation.job.DataAggregation -libjars /usr/share/scala/lib/scala-library.jar,/usr/share/scala/lib/scala-reflect.jar --hdfs --input s3n://path/to/input/data/file.json --output s3n://path/to/input/data/file.txt
Fixed. It turned out that there were multiple Scala library jars on the EMR instances, and they weren't coming from my application jar.
The 2.10 jar was hiding in /usr/share/aws/emr/emrfs/lib, separate from the installed location of the 2.11 binaries under /usr/share/scala. I removed the 2.10 jar on every instance of the cluster, and my job completed successfully. Next I will create a bootstrap action for this.
$ sudo find / -name "scala-library-2.10.*.jar" -exec rm -rf {} \;
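A bootstrap action for this could be a small script like the sketch below. The function name remove_scala_210 is hypothetical, and I have not verified that the EMRFS lib directory already exists at the point where bootstrap actions run, or that EMRFS itself never needs the 2.10 jar, so treat both as assumptions to check on your AMI.

```shell
#!/bin/bash
# Sketch of a bootstrap action body (hypothetical, untested on EMR):
# delete Scala 2.10 library jars so they cannot shadow the 2.11 library.
remove_scala_210() {
  # $1: directory to clean; defaults to the EMRFS lib directory where the
  # 2.10 jar was found in this post.
  find "${1:-/usr/share/aws/emr/emrfs/lib}" -type f \
    -name 'scala-library-2.10.*.jar' -exec rm -f {} +
}

# On a cluster node this needs root privileges, for example by running the
# whole script with sudo and then calling:
#   remove_scala_210 /usr/share/aws/emr/emrfs/lib
```

Restricting the search to one directory instead of / keeps the action fast and avoids deleting jars that other components on the node may rely on.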
For reference, these are the paths where the Scala library jars were found:
[ec2-user@ip-172-31-72-130 ~]$ sudo find / -name "scala-library-2.11.*.jar"
/home/hadoop/.versions/hbase-0.94.18/lib/scala-library-2.11.0.jar
/usr/share/doc/scala/api/jars/scala-library-2.11.1-javadoc.jar
[ec2-user@ip-172-31-72-130 ~]$ sudo find / -name "scala-library-2.10.*.jar"
/usr/share/aws/emr/emrfs/lib/scala-library-2.10.5.jar