apache-spark h2o sparkling-water

Building a minimal Sparkling Water application


I am new to Sparkling Water. I know how to run my program from sparkling-shell. However, I am not sure how to build a standalone application that I can pass to spark-submit. Which jars do I need to include to build my application?


Solution

  • Check the sparkling-water examples, e.g. ProstateDemo.scala, to see how to write a standalone sparkling-water app (creating the H2O context, etc.).

    Basically, you need to add sparkling-water-core as a dependency in your sbt/Maven/Gradle build and compile your jar. You then have 2 options for making it available at runtime:

    1. Build an assembly jar with sparkling-water-core in it. Here's an example I'm using for sbt:

      libraryDependencies += "ai.h2o" %% "sparkling-water-core" % "2.0.4" excludeAll(
        ExclusionRule(organization = "org.apache.spark"),
        ExclusionRule(organization = "org.slf4j"),
        ExclusionRule(organization = "com.google.guava"),
        ExclusionRule(organization = "org.eclipse.jetty.orbit"),
        ExclusionRule(organization = "com.esotericsoftware.kryo")
      )
      
    2. Compile your jar and pass sparkling-water-core to spark-submit with the --jars or --packages argument:

    spark-submit --packages ai.h2o:sparkling-water-core_2.11:2.0.4 your_jar.jar
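
    For reference, the application itself can be quite small. The following is only a minimal sketch of what the main class might look like, not the code from ProstateDemo.scala. The object name MinimalSparklingWaterApp is hypothetical, and the sketch assumes the 2.0.x API, where H2OContext.getOrCreate accepts a SparkContext; other releases may expose a different signature, so check the examples for your version.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.h2o.H2OContext

// Hypothetical object name, used for illustration only.
object MinimalSparklingWaterApp {
  def main(args: Array[String]): Unit = {
    // The master URL and other settings are expected to come from spark-submit.
    val spark = SparkSession.builder()
      .appName("MinimalSparklingWaterApp")
      .getOrCreate()

    // Start (or reuse) the H2O services on top of the Spark cluster.
    // Assumption: the getOrCreate(SparkContext) overload of the 2.0.x line.
    val hc = H2OContext.getOrCreate(spark.sparkContext)

    // Printing the context shows the details of the H2O cluster.
    println(hc)

    // Application logic goes here, e.g. converting a DataFrame with
    // hc.asH2OFrame(df) and training a model on the resulting frame.

    // Shut down H2O and the Spark context when done.
    hc.stop(stopSparkContext = true)
  }
}
```

    If the jar's manifest does not name a main class, add --class MinimalSparklingWaterApp (or whatever your own object is called) to the spark-submit command.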