Tags: apache-spark, pyspark, h2o, databricks, sparkling-water

Not able to create H2OContext in Databricks using pysparkling


I am not able to create an H2OContext on Databricks (Spark) using pysparkling. The configuration step works, but creating the context fails with the error shown further down.

Code:
    from pysparkling import *
    import h2o
    h2oConf = H2OConf(spark)
    h2oConf.set("spark.ui.enabled", True)

Out[2]: Sparkling Water configuration:
    backend cluster mode : internal
    workers              : None
    cloudName            : Not set yet, it will be set automatically before starting H2OContext.
    flatfile             : true
    clientBasePort       : 54321
    nodeBasePort         : 54321
    cloudTimeout         : 60000
    h2oNodeLog           : INFO
    h2oClientLog         : INFO
    nthreads             : -1
    drddMulFactor        : 10

Code: h2oContext = H2OContext.getOrCreate(spark, h2oConf)
Error: java.lang.NoSuchFieldError: quasibinomial

Here are the details of the cluster:
  1. Cluster:
     Runtime version: Spark 2.1 (Auto updating, Scala 2.11)
     Type: Standard
     Workers: 4

  2. Libraries attached to the above cluster: h2o_pysparkling_2.1,
     h2o-genmodel.jar

Solution

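  • Found the issue. I was using the Spark 2.1 (Auto updating, Scala 2.11) cluster runtime, but H2O Sparkling Water needs a Spark 2.1.X-dbx cluster (you must use a Spark 2.1 version with Scala 2.11). A java.lang.NoSuchFieldError generally means the classes loaded at runtime differ from the ones the library was compiled against, which is consistent with a runtime mismatch like this one.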
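
If it helps anyone else, here is a minimal sketch of a sanity check you could run before calling H2OContext.getOrCreate. It only compares the major.minor part of two version strings, on the assumption that the Sparkling Water release line is named after the Spark line it targets (as with h2o_pysparkling_2.1 and Spark 2.1). The helper names major_minor and versions_compatible are made up for illustration and are not part of pysparkling. Matching versions do not guarantee the runtime is compatible, so treat this as an early warning only.

```python
import re


def major_minor(version):
    """Return (major, minor) parsed from a version string such as '2.1.1-db4'."""
    match = re.match(r"\s*(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % (version,))
    return int(match.group(1)), int(match.group(2))


def versions_compatible(spark_version, sparkling_water_version):
    """True when both versions share the same major.minor line (hypothetical check)."""
    return major_minor(spark_version) == major_minor(sparkling_water_version)


# Example use in a notebook, assuming `spark` is the active SparkSession and
# "2.1" is the Sparkling Water line of the attached library:
#
#     if not versions_compatible(spark.version, "2.1"):
#         raise RuntimeError("Spark %s does not match Sparkling Water 2.1" % spark.version)
```

The check is pure Python, so it can run in the first notebook cell before any H2O classes are loaded.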