Tags: apache-spark, sbt, shuffle, java-opts

Run a Spark application locally via sbt


I want to run a Spark job locally for testing. It works just fine when I use spark-submit with an assembled jar.

However, when I use sbt run I get a very strange error: https://gist.github.com/geoHeil/946dd7706f44f338101c8332f4e13c1a

Setting java-opts like

javaOptions ++= Seq("-Xms512M", "-Xmx2048M", "-XX:MaxPermSize=2048M", "-XX:+CMSClassUnloadingEnabled")

did not solve the problem.
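
One thing worth checking: sbt passes javaOptions only to a forked JVM, so without fork := true they are silently ignored by sbt run. A minimal build.sbt sketch:

// Fork a separate JVM for `run` so that javaOptions actually take effect.
fork := true

javaOptions ++= Seq(
  "-Xms512M",
  "-Xmx2048M",
  // -XX:MaxPermSize is ignored on Java 8+, where PermGen was removed.
  "-XX:+CMSClassUnloadingEnabled"
)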

Fiddling with the memory settings in local[*] mode like

.set("spark.executor.memory", "7g")
.set("spark.driver.memory", "7g")

only caused further ExecutorLostFailure problems.
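
For context: in local[*] mode everything runs inside the driver JVM, so spark.executor.memory has no separate executor process to size, and spark.driver.memory is ignored when set in code because the driver JVM has already started; its heap has to come from the JVM flags themselves (the javaOptions above, or spark-submit's --driver-memory). A minimal local-mode setup, with an illustrative app name:

import org.apache.spark.sql.SparkSession

// Heap size is controlled by the JVM's own -Xmx, not by settings made here.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("local-test")
  .getOrCreate()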


Solution

  • I never ran into this issue specifically, but I don't think Spark code is meant to be run with sbt run. I even remember reading about this in the docs, but I can't find the passage right now.

    I guess what you should do instead is build the jar with sbt and run it with spark-submit.
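
    A minimal sketch of that route, assuming the sbt-assembly plugin is enabled in project/plugins.sbt (the Spark version, Scala version, and jar name below are illustrative):

    // build.sbt: mark Spark as "provided" so spark-submit supplies it at
    // runtime and the assembled jar stays small.
    libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.1.0" % "provided"

    // Then build and submit:
    //   sbt assembly
    //   spark-submit --master local[*] target/scala-2.11/myapp-assembly-0.1.jar

    Note that with "provided" scope the Spark classes are not on the sbt run classpath at all, which is one more reason sbt run and spark-submit behave differently.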