Long story short: I have an app that uses Spark DataFrames and machine learning, and ScalaFX for the front-end. I'd like to create a massive 'fat' jar so that it runs on any machine with a JVM.
I am familiar with the sbt-assembly plugin, having spent hours researching ways of assembling a jar. Below is my build.sbt:
lazy val root = (project in file(".")).
  settings(
    scalaVersion := "2.11.8",
    mainClass in assembly := Some("me.projects.MyProject.Main"),
    assemblyJarName in assembly := "MyProject_2.0.jar",
    test in assembly := {}
  )
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.0.0" withSources() withJavadoc()
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.0.0" withSources() withJavadoc()
libraryDependencies += "org.apache.spark" %% "spark-mllib" % "2.0.2" withSources() withJavadoc()
libraryDependencies += "joda-time" % "joda-time" % "2.9.4" withJavadoc()
libraryDependencies += "org.scalactic" %% "scalactic" % "3.0.1" % "provided"
libraryDependencies += "org.scalatest" %% "scalatest" % "3.0.1" % "test"
libraryDependencies += "org.scalafx" %% "scalafx" % "8.0.92-R10" withSources() withJavadoc()
libraryDependencies += "net.liftweb" %% "lift-json" % "2.6+" withSources() withJavadoc()
EclipseKeys.withSource := true
EclipseKeys.withJavadoc := true
// META-INF discarding
assemblyMergeStrategy in assembly := {
case PathList("org","aopalliance", xs @ _*) => MergeStrategy.last
case PathList("javax", "inject", xs @ _*) => MergeStrategy.last
case PathList("javax", "servlet", xs @ _*) => MergeStrategy.last
case PathList("javax", "activation", xs @ _*) => MergeStrategy.last
case PathList("org", "apache", xs @ _*) => MergeStrategy.last
case PathList("com", "google", xs @ _*) => MergeStrategy.last
case PathList("com", "esotericsoftware", xs @ _*) => MergeStrategy.last
case PathList("com", "codahale", xs @ _*) => MergeStrategy.last
case PathList("com", "yammer", xs @ _*) => MergeStrategy.last
case "about.html" => MergeStrategy.rename
case "META-INF/ECLIPSEF.RSA" => MergeStrategy.last
case "META-INF/mailcap" => MergeStrategy.last
case "META-INF/mimetypes.default" => MergeStrategy.last
case "plugin.properties" => MergeStrategy.last
case "log4j.properties" => MergeStrategy.last
case x =>
  val oldStrategy = (assemblyMergeStrategy in assembly).value
  oldStrategy(x)
}
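For reference, the plugins themselves are assumed to be enabled in project/plugins.sbt along these lines (the exact versions here are examples, not necessarily the ones I used):

// project/plugins.sbt (versions are examples)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")                 // provides the assembly task
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "4.0.0")  // provides EclipseKeys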
This runs fine on my Linux machine, which has Spark installed and configured. In the past I have taken assembled ScalaFX jars and opened them on a Windows machine with no issues. However, this application, which also uses Spark, gives the following:
ERROR SparkContext: Error initializing SparkContext.
java.lang.IllegalArgumentException: System memory 259522560 must be at least 471859200. Please increase the heap size using the --driver-memory option or spark.driver.memory in Spark configuration.
Things I have tried:
Setting different values for the executor/driver memory options when creating the SparkConf (in the Scala code), like this:
.set("spark.executor.memory", "12g") .set("spark.executor.driver", "5g") .set("spark.driver.memory","5g")
The application otherwise works fine (when run from Scala IDE, when run with spark-submit, and when opening the assembled jar on Linux).
Please let me know if this is possible. This is a small project that uses a GUI (ScalaFX) to run a couple of machine learning operations on some data (Spark). Hence the dependencies above.
Again, I am not looking to set up a cluster or anything like that. I'd like to access the Spark functionality just by running the jar on any computer with a JRE. This is a small project to be showcased.
It turns out it was a rather generic JVM issue. Instead of passing runtime parameters on every launch, I solved it by adding a new environment variable to the Windows system:
name: _JAVA_OPTIONS
value: -Xms512M -Xmx1024M
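An equivalent per-run alternative (instead of the machine-wide environment variable) is to pass the same standard JVM flags when launching the jar; the jar name here is the one set by assemblyJarName above:

java -Xms512M -Xmx1024M -jar MyProject_2.0.jar

As far as I can tell, Spark's check compares the required minimum (471859200 bytes, i.e. 450 MB) against the JVM's maximum heap, so raising the default heap on the Windows machine is what makes the error go away.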