Tags: apache-spark, apache-spark-mllib, apache-spark-ml

Spark does not load ARPACK or BLAS from netlib


I am computing the SVD of my data. Whenever I submit the Spark application with spark-submit, the log file contains:

WARN ARPACK/BLAS: Failed to load implementation from: com.github.fommil.netlib.NativeSystemARPACK/BLAS

I built Spark with the -Pnetlib-lgpl flag, and I also include the netlib dependency in my sbt file when creating the jar file:

libraryDependencies ++= Seq(
    "org.apache.spark"  % "spark-core_2.10"              % "1.5.0" % "provided",
    "org.apache.spark"  % "spark-mllib_2.10"             % "1.3.0",
    "com.github.fommil.netlib"  %   "all"   %   "1.1.2"     pomOnly()
)

Both gcc and gfortran report gcc version 4.8.0 (GCC). I also installed BLAS, LAPACK and ATLAS, and followed the instructions on the netlib-java site: https://github.com/fommil/netlib-java

In spark-shell, running import com.github.fommil.netlib._ completes without any error.

I have been trying to debug this problem for a while and I am out of ideas. Could someone help me figure this out?


Solution

  • It's a known pain point.

    I've successfully followed the instructions at https://github.com/PasaLab/marlin/issues/1 to get this horrid thing to work in Spark 1.4.x / 1.5.x with Intel MKL.

    I think there is roughly one place where you would have to tweak those instructions to link against ATLAS instead, but it should be doable.
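
A successful import only shows that the netlib-java classes are on the classpath, not that a native library was loaded. One way to check what was actually picked up is to print the concrete class behind each getInstance() call in spark-shell. This is a minimal sketch, assuming netlib-java is on the shell's classpath; loadedImplementations is a hypothetical helper name used for illustration, not part of netlib-java.

```scala
import com.github.fommil.netlib.{ARPACK, BLAS, LAPACK}

// Report the concrete class netlib-java selected for each library.
// loadedImplementations is a hypothetical helper, not a netlib-java API.
def loadedImplementations(): Map[String, String] = Map(
  "BLAS"   -> BLAS.getInstance().getClass.getName,
  "LAPACK" -> LAPACK.getInstance().getClass.getName,
  "ARPACK" -> ARPACK.getInstance().getClass.getName
)

// Print one line per library, e.g. "BLAS -> <implementation class name>".
loadedImplementations().foreach { case (lib, impl) => println(s"$lib -> $impl") }
```

A class name such as com.github.fommil.netlib.F2jBLAS means the pure-Java fallback is in use, while com.github.fommil.netlib.NativeSystemBLAS means the system library was loaded. This only reports what the driver JVM sees; executors load their own copy and may differ.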