
SBT in Apache Spark GraphFrames


I have the following SBT file. I am compiling Scala code that uses Apache GraphFrames and also reads a CSV file.

name := "Simple"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.6.1",
  "graphframes" % "graphframes" % "0.2.0-spark1.6-s_2.10",
  "org.apache.spark" %% "spark-sql" % "1.0.0",
  "com.databricks" % "spark-csv" % "1.0.3"
)

Here is my Scala code:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext
import org.graphframes._

// sqlContext is predefined in spark-shell; for a standalone app it must be created explicitly
val sc = new SparkContext(new SparkConf().setAppName("Simple"))
val sqlContext = new SQLContext(sc)

val nodesList = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("/Users/Desktop/GraphFrame/NodesList.csv")
val edgesList = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load("/Users/Desktop/GraphFrame/EdgesList.csv")
val v = nodesList.toDF("id", "name")
val e = edgesList.toDF("src", "dst", "dist")
val g = GraphFrame(v, e)

When I try to build the JAR file with SBT, it fails during compilation with the following error:

[trace] Stack trace suppressed: run last *:update for the full output.
[error] (*:update) sbt.ResolveException: unresolved dependency: graphframes#graphframes;0.2.0-spark1.6-s_2.10: not found
[error] Total time: 

Solution

  • GraphFrames is not in the Maven Central repository yet, so SBT cannot resolve it with the default resolvers.

    You can either:

    1. download the artifact from the Spark Packages page and install it into your local repository, or
    2. add the Spark Packages repository to your SBT build.sbt.

    Code in build.sbt:

    resolvers += "SparkPackages" at "https://dl.bintray.com/spark-packages/maven/"

    The repository has a Maven layout, so declare it with the `at` syntax rather than `Resolver.url`, which is meant for Ivy-style repositories.
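
    Putting it together, the build.sbt from the question with the resolver added might look like the sketch below. This is an assumption-laden example, not a verified build: the spark-sql version is aligned with spark-core 1.6.1 (the 1.0.0 in the question would pull in an incompatible Spark), spark-csv is given `%%` so the Scala 2.10 suffix is appended, and the `s_2.10` suffix in the graphframes artifact must match scalaVersion.

    ```scala
    name := "Simple"

    version := "1.0"

    scalaVersion := "2.10.5"

    // graphframes is hosted on Spark Packages, not Maven Central
    resolvers += "SparkPackages" at "https://dl.bintray.com/spark-packages/maven/"

    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core"  % "1.6.1",
      // keep spark-sql in step with spark-core
      "org.apache.spark" %% "spark-sql"   % "1.6.1",
      // artifact suffix s_2.10 must match scalaVersion above
      "graphframes"      %  "graphframes" % "0.2.0-spark1.6-s_2.10",
      // %% appends the _2.10 suffix that the published artifact carries
      "com.databricks"   %% "spark-csv"   % "1.0.3"
    )
    ```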