I have a problem connecting to my PostgreSQL database from a Spark application launched on a cluster of the Bluemix Apache-Spark service using the spark-submit.sh script.
My Scala code is:
val conf = new SparkConf().setAppName("My demo").setMaster("local")
val sc = new SparkContext(conf)
val sqlContext = new SQLContext(sc)
val driver = "org.postgresql.Driver"
val url = "jdbc:postgresql://aws-us-east-1-portal.16.dblayer.com:10394/tennisdb?user=***&password=***"
println("create")
try {
  Class.forName(driver)
  val jdbcDF = sqlContext.read.format("jdbc").options(Map("url" -> url, "driver" -> driver, "dbtable" -> "inputdata")).load()
  jdbcDF.show()
  println("success")
} catch {
  case e: Throwable =>
    println(e.toString())
    println("Exception")
}
sc.stop()
I'm using an sbt file to resolve the dependencies. The sbt file is:
name := "spark-sample"
version := "1.0"
scalaVersion := "2.10.4"
// Adding spark modules dependencies
val sparkModules = List("spark-core",
"spark-streaming",
"spark-sql",
"spark-hive",
"spark-mllib",
"spark-repl",
"spark-graphx"
)
val sparkDeps = sparkModules.map( module => "org.apache.spark" % s"${module}_2.10" % "1.4.0" )
libraryDependencies ++= sparkDeps
libraryDependencies += "org.postgresql" % "postgresql" % "9.4-1201-jdbc41"
Then I run the sbt package command to create a jar for my application so I can run it on a cluster with the Bluemix Apache-Spark service. The jar is created successfully and the application runs locally without any errors. But when I submit the application to the Bluemix Apache-Spark service using the spark-submit.sh script, I get a ClassNotFoundException for org.postgresql.Driver.
Another easy way to do this: just put all the library jars in the directory that contains your application jar and tell spark-submit.sh to look there.
[charles@localhost tweetoneanalyzer]$ spark-submit --jars $(echo application/*.jar | tr ' ' ',') --class "SparkTweets" --master local[3] application/spark-sample.jar
In the example above, spark-submit uploads to the server all the jars matched by the --jars flag (everything under the application folder), so you should put any library jars you use there (in your case, the PostgreSQL driver jar, e.g. postgresql-9.4-1201-jdbc41.jar) and specify your application jar, application/spark-sample.jar, as the last argument.
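To see what the `$(echo application/*.jar | tr ' ' ',')` substitution actually hands to `--jars`, here is a minimal sketch using a scratch directory with placeholder jar names (the directory and file names are just for illustration):

```shell
# Fresh scratch directory with two placeholder jars.
rm -rf /tmp/jars-demo && mkdir -p /tmp/jars-demo
touch /tmp/jars-demo/postgresql-9.4-1201-jdbc41.jar /tmp/jars-demo/spark-sample.jar

# The shell glob expands to space-separated paths; tr joins them with
# commas, which is the format --jars expects.
echo /tmp/jars-demo/*.jar | tr ' ' ','
# → /tmp/jars-demo/postgresql-9.4-1201-jdbc41.jar,/tmp/jars-demo/spark-sample.jar
```

So the command simply passes every jar in that folder to spark-submit as a comma-separated list, which is how the PostgreSQL driver ends up on the executors' classpath.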
Thanks,
Charles.