Tags: java, maven, apache-spark, spark-streaming, apache-kafka-streams

Why does Spark application fail with "Exception in thread "main" java.lang.NoClassDefFoundError: ...StringDeserializer"?


I am developing a Spark application that listens to a Kafka stream using Spark and Java.

I use kafka_2.10-0.10.2.1.

I have set various parameters for Kafka properties: bootstrap.servers, key.deserializer, value.deserializer, etc.
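As a point of reference, the Kafka parameters described above are typically collected into a map before being handed to the Spark Kafka connector. A minimal sketch (the broker address and group id below are placeholders, not values from the question):

```java
import java.util.HashMap;
import java.util.Map;

public class KafkaParams {
    // Builds the Kafka consumer parameters referenced in the question.
    public static Map<String, Object> build() {
        Map<String, Object> kafkaParams = new HashMap<>();
        // Placeholder broker address; replace with your cluster's brokers.
        kafkaParams.put("bootstrap.servers", "localhost:9092");
        // These are the classes whose absence triggers the NoClassDefFoundError:
        // they live in the kafka-clients jar, which must be on the runtime classpath.
        kafkaParams.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        kafkaParams.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Hypothetical consumer group id for illustration.
        kafkaParams.put("group.id", "example-group");
        return kafkaParams;
    }

    public static void main(String[] args) {
        System.out.println(build().get("key.deserializer"));
    }
}
```

Note that the deserializers are referenced only by class name here; the JVM resolves them at runtime, which is why a compile can succeed while spark-submit fails.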

My application compiles fine, but when I submit it, it fails with the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/kafka/common/serialization/StringDeserializer

I do use StringDeserializer for both key.deserializer and value.deserializer, so the error does seem related to how my application is set up.

Various maven dependencies used in pom.xml:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.11</artifactId>
        <version>2.1.1</version>
    </dependency>
    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming_2.10</artifactId>
        <version>2.1.1</version>
    </dependency>

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
        <version>2.1.1</version>
    </dependency>

I have tried updating the versions of Spark Streaming and Kafka, but could not find much information about this error anywhere.


Solution

  • spark-streaming_2.10

    This artifact depends on Scala 2.10, while your other dependencies (spark-core_2.11 and spark-streaming-kafka-0-10_2.11) use Scala 2.11. Mixing Scala versions in one build is not supported, so change it to spark-streaming_2.11. Aligning the Scala suffix is the correct fix for the current error.

    Also make sure that the spark-streaming-kafka-0-10 connector matches the version of the Kafka broker you're running (here, Kafka 0.10.2.1, which 0-10 covers).
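Putting both points together, the dependency section might look like this (versions taken from the question; only the Scala suffix of spark-streaming changes):

```xml
<!-- All Spark artifacts on the same Scala line (2.11) and Spark version (2.1.1) -->
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.11</artifactId>
    <version>2.1.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.1.1</version>
</dependency>
<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
    <version>2.1.1</version>
</dependency>
```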

    Application is compiling fine but when I am trying to submit the spark job, its showing error: Exception in thread "main" java.lang.NoClassDefFoundError:

    By default, Maven does not include dependency jars in the artifact it builds. So org.apache.kafka:kafka-clients, which provides StringDeserializer, is missing from the classpath when you run spark-submit, even though compilation succeeded. You need to either build an uber jar that bundles your dependencies, or make them available at submit time.
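One common way to bundle dependencies is the maven-shade-plugin; a minimal sketch (the plugin version shown is an assumption, use whatever is current for your build):

```xml
<!-- Sketch: build an uber jar containing kafka-clients and other dependencies.
     Plugin version 3.2.4 is an assumption, not taken from the question. -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals>
            <goal>shade</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Alternatively, you can skip the uber jar and let Spark fetch the connector and its transitive dependencies at submit time with `spark-submit --packages org.apache.spark:spark-streaming-kafka-0-10_2.11:2.1.1`. Note that Spark itself (spark-core, spark-streaming) should be marked `provided` rather than shaded into the jar, since the cluster supplies it.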