java apache-spark cassandra noclassdeffounderror spark-cassandra-connector

NoClassDefFoundError: org/apache/spark/sql/DataFrame in spark-cassandra-connector


I'm trying to upgrade spark-cassandra-connector from 1.4 to 1.5.

Everything seems fine, but when I run the test cases, the process hangs partway through and logs this error message:

Exception in thread "dag-scheduler-event-loop" java.lang.NoClassDefFoundError: org/apache/spark/sql/DataFrame

My pom file looks like:

<dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <!-- https://mvnrepository.com/artifact/com.datastax.spark/spark-cassandra-connector_2.10 -->
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.10</artifactId>
        <version>1.5.0</version>
    </dependency>
    <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>16.0.1</version>
    </dependency>
    <!-- Scala Library -->
    <dependency>
        <groupId>org.scala-lang</groupId>
        <artifactId>scala-library</artifactId>
        <version>2.10.5</version>
    </dependency>
    <!--Spark Cassandra Connector-->
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.10</artifactId>
        <version>1.5.0</version>
    </dependency>
    <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector-java_2.10</artifactId>
        <version>1.5.0</version>
    </dependency>
    <dependency>
      <groupId>com.datastax.cassandra</groupId>
      <artifactId>cassandra-driver-core</artifactId>
      <version>3.0.2</version>
    </dependency>
    <!--Spark-->
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>1.5.0</version>
        <exclusions>
          <exclusion>
            <groupId>net.java.dev.jets3t</groupId>
            <artifactId>jets3t</artifactId>
          </exclusion>
        </exclusions>
    </dependency>
  </dependencies>
</project>

Can anyone please help me with this? If you need more info, please let me know.

Thank you in advance!


Solution

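  • Try adding the spark-sql dependency, which contains org.apache.spark.sql.DataFrame (set ${spark.version} to the Spark version you use):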

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.10</artifactId>
        <version>${spark.version}</version>
        <scope>provided</scope>
    </dependency>
    

    Also make sure that your version of spark-cassandra-connector is compatible with the version of Spark you're using. I had the same error message, even with all the proper dependencies, when I was trying to use an older spark-cassandra-connector with a newer Spark version. Refer to this table: https://github.com/datastax/spark-cassandra-connector#version-compatibility
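
To confirm whether the class from the stack trace is actually visible to your tests after changing the pom, a small check like the following may help. This is only a sketch: `ClasspathCheck` and `isOnClasspath` are hypothetical names made up for illustration, and the class name is copied from the error message in the question.

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be located by this class's class loader.
     * Hypothetical helper for illustration; not part of Spark or the connector.
     */
    public static boolean isOnClasspath(String className) {
        try {
            // initialize=false: only locate and load the class, do not run its static initializers
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError raised while resolving the class
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name taken from the NoClassDefFoundError in the question
        String name = "org.apache.spark.sql.DataFrame";
        System.out.println(name + " on classpath: " + isOnClasspath(name));
    }
}
```

If this reports false when run from the test classpath, the spark-sql jar is probably still missing or being excluded; running `mvn dependency:tree` should show which Spark artifacts Maven actually resolves.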