I have an sbt project that I am trying to build into a jar with the sbt-assembly plugin.
build.sbt:
name := "project-name"
version := "0.1"
scalaVersion := "2.11.12"
val sparkVersion = "2.4.0"
libraryDependencies ++= Seq(
"org.scalatest" %% "scalatest" % "3.0.5" % "test",
"org.apache.spark" %% "spark-core" % sparkVersion % "provided",
"org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
"org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
"com.holdenkarau" %% "spark-testing-base" % "2.3.1_0.10.0" % "test",
// spark-hive dependencies for DataFrameSuiteBase. https://github.com/holdenk/spark-testing-base/issues/143
"org.apache.spark" %% "spark-hive" % sparkVersion % "provided",
"com.amazonaws" % "aws-java-sdk" % "1.11.513" % "provided",
"com.amazonaws" % "aws-java-sdk-sqs" % "1.11.513" % "provided",
"com.amazonaws" % "aws-java-sdk-s3" % "1.11.513" % "provided",
//"org.apache.hadoop" % "hadoop-aws" % "3.1.1"
"org.json" % "json" % "20180813"
)
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs @ _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
test in assembly := {}
// https://github.com/holdenk/spark-testing-base
fork in Test := true
javaOptions ++= Seq("-Xms512M", "-Xmx2048M", "-XX:MaxPermSize=2048M", "-XX:+CMSClassUnloadingEnabled")
parallelExecution in Test := false
When I build the project with sbt assembly, the resulting jar contains /org/junit/... and /org/opentest4j/... files
Is there any way to not include these test related files in the final jar?
I have tried replacing the line:
"org.scalatest" %% "scalatest" % "3.0.5" % "test"
with:
"org.scalatest" %% "scalatest" % "3.0.5" % "provided"
I am also wondering how the files are included in the jar as junit is not referenced inside build.sbt (there are junit tests in the project however)?
Updated:
name := "project-name"
version := "0.1"
scalaVersion := "2.11.12"
val sparkVersion = "2.4.0"
val excludeJUnitBinding = ExclusionRule(organization = "junit")
libraryDependencies ++= Seq(
// Provided
"org.apache.spark" %% "spark-core" % sparkVersion % "provided" excludeAll(excludeJUnitBinding),
"org.apache.spark" %% "spark-sql" % sparkVersion % "provided" excludeAll(excludeJUnitBinding),
"org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
"com.holdenkarau" %% "spark-testing-base" % "2.3.1_0.10.0" % "provided" excludeAll(excludeJUnitBinding),
"org.apache.spark" %% "spark-hive" % sparkVersion % "provided",
"com.amazonaws" % "aws-java-sdk" % "1.11.513" % "provided",
"com.amazonaws" % "aws-java-sdk-sqs" % "1.11.513" % "provided",
"com.amazonaws" % "aws-java-sdk-s3" % "1.11.513" % "provided",
// Test
"org.scalatest" %% "scalatest" % "3.0.5" % "test",
// Necessary
"org.json" % "json" % "20180813"
)
excludeDependencies += excludeJUnitBinding
// https://stackoverflow.com/questions/25144484/sbt-assembly-deduplication-found-error
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs @ _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
// https://github.com/holdenk/spark-testing-base
fork in Test := true
javaOptions ++= Seq("-Xms512M", "-Xmx2048M", "-XX:MaxPermSize=2048M", "-XX:+CMSClassUnloadingEnabled")
parallelExecution in Test := false
To exclude certain transitive dependencies of a dependency, use the excludeAll
or exclude
methods.
The exclude
method should be used when a pom will be published for the project. It requires the organization and module name to exclude.
For example:
libraryDependencies +=
"log4j" % "log4j" % "1.2.15" exclude("javax.jms", "jms")
The excludeAll method is more flexible, but because it cannot be represented in a pom.xml, it should only be used when a pom doesn’t need to be generated.
For example,
libraryDependencies +=
"log4j" % "log4j" % "1.2.15" excludeAll(
ExclusionRule(organization = "com.sun.jdmk"),
ExclusionRule(organization = "com.sun.jmx"),
ExclusionRule(organization = "javax.jms")
)
In certain cases a transitive dependency should be excluded from all dependencies. This can be achieved by setting up ExclusionRules in excludeDependencies(For sbt 0.13.8 and above).
excludeDependencies ++= Seq(
ExclusionRule("commons-logging", "commons-logging")
)
JUnit jar file downloads as part of below dependencies.
"org.apache.spark" %% "spark-core" % sparkVersion % "provided" //(junit)
"org.apache.spark" %% "spark-sql" % sparkVersion % "provided"// (junit)
"com.holdenkarau" %% "spark-testing-base" % "2.3.1_0.10.0" % "test" //(org.junit)
To exclude junit file please update your dependency as below.
val excludeJUnitBinding = ExclusionRule(organization = "junit")
"org.scalatest" %% "scalatest" % "3.0.5" % "test",
"org.apache.spark" %% "spark-core" % sparkVersion % "provided" excludeAll(excludeJUnitBinding),
"org.apache.spark" %% "spark-sql" % sparkVersion % "provided" excludeAll(excludeJUnitBinding),
"org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
"com.holdenkarau" %% "spark-testing-base" % "2.3.1_0.10.0" % "test" excludeAll(excludeJUnitBinding)
Update: Please update your build.abt as below.
resolvers += Resolver.url("bintray-sbt-plugins",
url("https://dl.bintray.com/eed3si9n/sbt-plugins/"))(Resolver.ivyStylePatterns)
val excludeJUnitBinding = ExclusionRule(organization = "junit")
libraryDependencies ++= Seq(
// Provided
"org.apache.spark" %% "spark-core" % sparkVersion % "provided" excludeAll(excludeJUnitBinding),
"org.apache.spark" %% "spark-sql" % sparkVersion % "provided" excludeAll(excludeJUnitBinding),
"org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
"com.holdenkarau" %% "spark-testing-base" % "2.3.1_0.10.0" % "provided" excludeAll(excludeJUnitBinding),
"org.apache.spark" %% "spark-hive" % sparkVersion % "provided",
//"com.amazonaws" % "aws-java-sdk" % "1.11.513" % "provided",
//"com.amazonaws" % "aws-java-sdk-sqs" % "1.11.513" % "provided",
//"com.amazonaws" % "aws-java-sdk-s3" % "1.11.513" % "provided",
// Test
"org.scalatest" %% "scalatest" % "3.0.5" % "test",
// Necessary
"org.json" % "json" % "20180813"
)
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
assemblyMergeStrategy in assembly := {
case PathList("META-INF", xs @ _*) => MergeStrategy.discard
case x => MergeStrategy.first
}
fork in Test := true
javaOptions ++= Seq("-Xms512M", "-Xmx2048M", "-XX:MaxPermSize=2048M", "-XX:+CMSClassUnloadingEnabled")
parallelExecution in Test := false
plugin.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.0")