apache-spark cassandra datastax-enterprise-graph dse-graph-frames

Can't initialize graph on Datastax using Spark


I'm trying to initialize my DataStax graph using Spark as follows:

val graphBuilder = spark.dseGraph("GRAPH_NAME")

but I get the following exception:

Exception in thread "main" java.lang.NoClassDefFoundError: com/datastax/bdp/graph/impl/element/vertex/id/AbstractVertexIdExternalImpl
at com.datastax.bdp.graph.spark.graphframe.DseGraphFrameBuilder$.apply(DseGraphFrameBuilder.scala:257)
at com.datastax.bdp.graph.spark.graphframe.SparkSessionFunctions.dseGraph(SparkSessionFunctions.scala:20)

I searched for a DataStax jar that contains com/datastax/bdp/graph/impl but couldn't find one.

Any help is really appreciated. Thanks in advance!


Solution

  • To use DseGraphFrames in a program running on an external Spark cluster, you need to link against the so-called BYOS (Bring Your Own Spark) jar. With Maven, this can be done as follows:

    <dependency>
      <groupId>com.datastax.dse</groupId>
      <artifactId>dse-byos_2.11</artifactId>
      <version>6.0.4</version>
    </dependency>
    

    Then add the DataStax repository:

    <repositories>
        <repository>
          <id>DataStax-Repo</id>
          <url>https://repo.datastax.com/public-repos/</url>
        </repository>
    </repositories>
    

    If you inspect the jar fetched from the DataStax repository, you can see that it contains the missing class:

    unzip -l dse-byos_2.11-6.0.4.jar | grep AbstractVertexIdExternalImpl
         2839  10-06-2018 15:22   com/datastax/bdp/graph/impl/element/vertex/id/AbstractVertexIdExternalImpl.class
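
The same check can also be made from inside the application, which helps when the jar is present on disk but may not have reached the runtime classpath. The following is only a minimal sketch: the object name `ClasspathCheck` and the helper `isOnClasspath` are hypothetical names introduced for illustration, and the class name is taken from the stack trace in the question.

```scala
object ClasspathCheck {
  // Returns true if the named class can be loaded by this application's class loader.
  // The class is loaded without being initialized, so no static initializers run.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      // ClassNotFoundException: the class itself is missing.
      // NoClassDefFoundError: the class was found but something it depends on is missing.
      case _: ClassNotFoundException | _: NoClassDefFoundError => false
    }

  def main(args: Array[String]): Unit = {
    // Class reported as missing in the question's stack trace.
    val cls = "com.datastax.bdp.graph.impl.element.vertex.id.AbstractVertexIdExternalImpl"
    if (isOnClasspath(cls)) println(s"Found $cls")
    else println(s"Missing $cls: check that the dse-byos jar is on the classpath")
  }
}
```

If this reports the class as missing even though the dependency is declared, the jar is probably not being shipped with the application (for example, it was not included in the assembled jar or not passed to Spark at submit time).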