scala, apache-spark, sbt, sbt-assembly

Dynamically changing library dependencies in an sbt build file (from "provided", etc.)


We use Spark a lot for our Scala applications. When I'm testing locally, my library dependencies are:

  libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.1",
  libraryDependencies += "org.apache.spark" % "spark-sql_2.10" % "1.6.1",

whereas if I'm building a jar to deploy I use:

  libraryDependencies += "org.apache.spark" % "spark-core_2.10" % "1.6.1" % "provided",
  libraryDependencies += "org.apache.spark" % "spark-sql_2.10" % "1.6.1" % "provided",

Due to the nature of the work, we may sometimes have to flip back and forth a few times while trying different things. Inevitably, at some point I forget to change the build file and end up wasting time. It's not a lot of time, but enough to prompt me into asking this question.

So, is anyone aware of a way (other than remembering to 'do it right') of having the build file update the "provided" setting depending on a trigger? Perhaps a configuration option that reads test or live, for example?

Thanks in advance.


Solution

  • I have built this kind of dynamic setup myself, choosing between two different dependency configurations in my examples; I needed one or the other based on a specific condition.

    You can do this in two ways. Either way you need to feed the choice into the build from outside, so both approaches use command-line parameters (JVM system properties).

    1) Using build.sbt itself.

    a) Define a system property with the name "sparkVersion".

    b) Read that property in build.sbt (you can write Scala code in build.sbt; it is compiled as Scala at build time anyway).

    c) Declare the dependencies conditionally, as below.

    // Read the "sparkVersion" system property; fall back to "default" if it is not set.
    val sparkVersion = Option(System.getProperty("sparkVersion")).getOrElse("default")

    if (sparkVersion == "newer") {
      println(" newer one")
      // bundle spark-core into the jar
      libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"
    } else {
      println(" default one")
      // "provided": compile against spark-core, but keep it out of the assembled jar
      libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0" % "provided"
    }
    

    You can play with any of the build options this way, at will.
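
    Applied directly to the question's "provided" toggle, a minimal sketch could look like the following. The property name sparkScope is my own invention for illustration, not something sbt or Spark defines:

    // Hypothetical property: pass -DsparkScope=provided when building a deployable jar;
    // omit it (or pass -DsparkScope=compile) when testing locally.
    val sparkScope = Option(System.getProperty("sparkScope")).getOrElse("compile")

    libraryDependencies ++= Seq(
      "org.apache.spark" % "spark-core_2.10" % "1.6.1" % sparkScope,
      "org.apache.spark" % "spark-sql_2.10"  % "1.6.1" % sparkScope
    )

    A deploy build would then be run with something like sbt clean assembly -DsparkScope=provided, mirroring the command shown further down.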

    2) Using a Build.scala file. Create a file at project/Build.scala and put the code below in it:

    import sbt._
    import Keys._

    object MyBuild extends Build {
      // Read the "scalaTestVersion" system property, falling back to a default marker.
      val myOptionValue = Option(System.getProperty("scalaTestVersion")).getOrElse("defaultValue")

      // Choose the scalatest version based on the property.
      val depVersion = if (myOptionValue == "newer") {
        println(" asked for newer version")
        "2.2.6"
      } else {
        println(" asked for older/default version")
        "2.2.0"
      }

      val dependencies = Seq(
        "org.scalatest" %% "scalatest" % depVersion % "test"
      )

      lazy val exampleProject = Project("SbtExample", file(".")).settings(
        version       := "1.2",
        scalaVersion  := "2.10.4",
        libraryDependencies ++= dependencies
      )
    }
    

    After this, just run the build command as shown below:

    sbt clean compile -DsparkVersion=newer -DscalaTestVersion=newer

    I have shown the build command with both properties. You can choose either approach and pass only the corresponding option. Please write to me if you need any help.
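
    If you pass neither property, both reads simply fall back to their getOrElse defaults, so the plain command below builds the default branches (the "provided" Spark dependency in the first example and scalatest 2.2.0 in the second):

    sbt clean compile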

    For resolving duplicate files during assembly, you can add the merge strategy below to build.sbt:

    // sbt-assembly merge strategy: discard duplicated metadata and log4j config files,
    // merge service registrations and reference.conf, and take the first copy of anything else.
    mergeStrategy in assembly := {
      case m if m.toLowerCase.endsWith("manifest.mf")          => MergeStrategy.discard
      case m if m.toLowerCase.matches("meta-inf.*\\.sf$")      => MergeStrategy.discard
      case "log4j.properties"                                  => MergeStrategy.discard
      case "log4j-defaults.properties"                         => MergeStrategy.discard
      case m if m.toLowerCase.startsWith("meta-inf/services/") => MergeStrategy.filterDistinctLines
      case "reference.conf"                                    => MergeStrategy.concat
      case _                                                   => MergeStrategy.first
    }
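
    The merge strategy above assumes the sbt-assembly plugin is already enabled for the build. If it is not, a typical project/plugins.sbt entry looks something like the line below (the version number is indicative of an sbt 0.13-era build; check the sbt-assembly releases for the one matching your sbt version):

    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")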
    

    With this, you will see just how capable and, frankly, magical sbt can be.