At work we use Gradle on a Scalding project, and I'm trying to come up with the simplest possible job to get the hang of the stack.
My class looks like this:
package org.playground

import com.twitter.scalding._

class readCsv(args: Args) extends Job(args) {
  val csv: Csv = Csv(args("input"), ("firstName", "lastName"))
  println(csv)
}
and lives in playground/src/org/playground/readCsv.scala. My build script looks like this:
apply plugin: 'scala'

archivesBaseName = 'playground'
mainClassName = 'org.playground.readCsv'

repositories {
    mavenLocal()
    mavenCentral()
    maven {
        url 'http://conjars.org/repo/'
        artifactUrls 'http://clojars.org/repo/'
        artifactUrls 'http://maven.twttr.com/'
    }
}

dependencies {
    compile 'org.scala-lang:scala-compiler:2.9.2'
    compile 'org.scala-lang:scala-library:2.9.2'
    compile 'bixo:bixo-core:0.9.1'
    compile 'org.apache.hadoop:hadoop-core:1.2.1'
    compile 'com.twitter:scalding_2.9.2:0.8.1'
    compile 'cascading:cascading-core:2.1.6'
    compile 'cascading:cascading-hadoop:2.1.6'
    testCompile 'org.testng:testng:6.8.7'
    testCompile 'org.scala-tools.testing:specs:1.6.2.2_1.5.0'
}

test {
    useTestNG()
}

jar {
    description = "Assembles a Hadoop-ready JAR file"
    manifest {
        attributes("Main-Class": "org.playground.readCsv")
    }
}
This compiles and builds successfully, but trying to run the jar throws this error:
$ java -jar build/libs/playground.jar
Exception in thread "main" java.lang.NoClassDefFoundError: org/playground/readCsv
Caused by: java.lang.ClassNotFoundException: org.playground.readCsv
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
My educated guess is that having the job extend Job fails to conform to some convention, so the class doesn't look like a valid Main-Class, but I wouldn't expect the error to be that the class can't be found.
The other possibility is that running it with java -jar jarname is incorrect and I just need to run it with hadoop or something along those lines.
Either way, just to validate: what is wrong with my setup?
The source file is in the wrong location. By default, the Scala plugin expects it at src/main/scala/org/playground/readCsv.scala. Anywhere else, it won't even get compiled, so the class never ends up in the jar and the JVM cannot find it.
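If moving the file is not an option, an alternative is to tell Gradle where the sources actually are. The fragment below is a minimal sketch for build.gradle, assuming every Scala source sits directly under playground/src as in the question; the 'src' directory value is taken from that layout and would need adjusting for any other structure.

```groovy
// Sketch: point the Scala plugin's main source set at the existing layout
// instead of the default src/main/scala.
sourceSets {
    main {
        scala {
            // Assumption: sources live under src/ (e.g. src/org/playground/readCsv.scala)
            srcDirs = ['src']
        }
    }
}
```

Following the default src/main/scala convention is usually the simpler choice, since it needs no extra configuration and matches what other Gradle and Maven tooling expects.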