apache-spark, virtual-machine, google-cloud-dataproc, spark-cloudant

What is the solution for the error, “jblas is not a member of package org.apache”?


I tried to solve it using both of these threads (this and this). The fix worked on my own virtual machine, but not on Cloud Dataproc, even though I followed the same process in both environments. The error on the cloud is the same one I originally saw on the virtual machine. What should be done on the cloud to solve it?

Screenshot of the error


Solution

  • Did you do the full "git clone" steps in those linked threads? And did you actually need to modify jblas? If not, you should just pull it from Maven Central using --packages org.jblas:jblas:1.2.4, without the git clone or mvn install; the following worked fine for me on a new Dataproc cluster:

    $ spark-shell --packages org.jblas:jblas:1.2.4
    Ivy Default Cache set to: /home/dhuo/.ivy2/cache
    The jars for the packages stored in: /home/dhuo/.ivy2/jars
    :: loading settings :: url = jar:file:/usr/lib/spark/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
    org.jblas#jblas added as a dependency
    :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0
      confs: [default]
      found org.jblas#jblas;1.2.4 in central
    downloading https://repo1.maven.org/maven2/org/jblas/jblas/1.2.4/jblas-1.2.4.jar ...
      [SUCCESSFUL ] org.jblas#jblas;1.2.4!jblas.jar (605ms)
    :: resolution report :: resolve 713ms :: artifacts dl 608ms
      :: modules in use:
      org.jblas#jblas;1.2.4 from central in [default]
      ---------------------------------------------------------------------
      |                  |            modules            ||   artifacts   |
      |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
      ---------------------------------------------------------------------
      |      default     |   1   |   1   |   1   |   0   ||   1   |   1   |
      ---------------------------------------------------------------------
    :: retrieving :: org.apache.spark#spark-submit-parent
      confs: [default]
      1 artifacts copied, 0 already retrieved (10360kB/29ms)
    Setting default log level to "WARN".
    To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
    ivysettings.xml file not found in HIVE_HOME or HIVE_CONF_DIR,/etc/hive/conf.dist/ivysettings.xml will be used
    Spark context Web UI available at http://10.240.2.221:4040
    Spark context available as 'sc' (master = yarn, app id = application_1501548510890_0005).
    Spark session available as 'spark'.
    Welcome to
          ____              __
         / __/__  ___ _____/ /__
        _\ \/ _ \/ _ `/ __/  '_/
       /___/ .__/\_,_/_/ /_/\_\   version 2.2.0
          /_/
    
    Using Scala version 2.11.8 (OpenJDK 64-Bit Server VM, Java 1.8.0_131)
    Type in expressions to have them evaluated.
    Type :help for more information.
    
    scala> import org.jblas.DoubleMatrix
    import org.jblas.DoubleMatrix
    
    scala> :quit
    

    Additionally, if you need to submit jobs that require extra packages via Dataproc's job-submission API, note that --packages is syntactic sugar in the various Spark launcher scripts rather than a property of the Spark job itself. In that case you must set the equivalent Spark property, spark.jars.packages, instead, as explained in this StackOverflow answer.
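
    For example, a Spark job submitted through the gcloud CLI can pass the package as a job property (a minimal sketch; the cluster name, main class, and jar path are placeholders, not values from the question):

    $ gcloud dataproc jobs submit spark \
        --cluster=my-cluster \
        --class=com.example.MyApp \
        --jars=gs://my-bucket/my-app.jar \
        --properties=spark.jars.packages=org.jblas:jblas:1.2.4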