Search code examples
apache-sparkapache-kafkaspark-streamingspark-streaming-kafka

Error: spark Streaming with kafka stream package does not work in spark-shell


I am trying use spark streaming to read from a kafka stream using spark-shell.

I have spark 3.0.1, so I am loading spark-shell with:

spark-shell --packages "org.apache.spark:spark-sql-kafka-0-10_2.12:3.0.1"

However, I receive the following error:

WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/Users/username/usr/spark-3.0.1-bin-hadoop2.7/jars/spark-unsafe_2.12-3.0.1.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
Ivy Default Cache set to: /Users/username/.ivy2/cache
The jars for the packages stored in: /Users/username/.ivy2/jars
:: loading settings :: url = jar:file:/Users/username/usr/spark-3.0.1-bin-hadoop2.7/jars/ivy-2.4.0.jar!/org/apache/ivy/core/settings/ivysettings.xml
org.apache.spark#spark-sql-kafka-0-10_2.12 added as a dependency
:: resolving dependencies :: org.apache.spark#spark-submit-parent-9b21d110-fcf8-4ec3-b4a5-9600d618aa83;1.0
        confs: [default]
        found org.apache.spark#spark-sql-kafka-0-10_2.12;3.0.1 in central
        found org.apache.spark#spark-token-provider-kafka-0-10_2.12;3.0.1 in central
        found org.apache.kafka#kafka-clients;2.4.1 in central
        found com.github.luben#zstd-jni;1.4.4-3 in central
        found org.lz4#lz4-java;1.7.1 in central
        found org.xerial.snappy#snappy-java;1.1.7.5 in central
        found org.slf4j#slf4j-api;1.7.30 in local-m2-cache
        found org.spark-project.spark#unused;1.0.0 in local-m2-cache
        found org.apache.commons#commons-pool2;2.6.2 in central
:: resolution report :: resolve 405ms :: artifacts dl 11ms
        :: modules in use:
        com.github.luben#zstd-jni;1.4.4-3 from central in [default]
        org.apache.commons#commons-pool2;2.6.2 from central in [default]
        org.apache.kafka#kafka-clients;2.4.1 from central in [default]
        org.apache.spark#spark-sql-kafka-0-10_2.12;3.0.1 from central in [default]
        org.apache.spark#spark-token-provider-kafka-0-10_2.12;3.0.1 from central in [default]
        org.lz4#lz4-java;1.7.1 from central in [default]
        org.slf4j#slf4j-api;1.7.30 from local-m2-cache in [default]
        org.spark-project.spark#unused;1.0.0 from local-m2-cache in [default]
        org.xerial.snappy#snappy-java;1.1.7.5 from central in [default]
        ---------------------------------------------------------------------
        |                  |            modules            ||   artifacts   |
        |       conf       | number| search|dwnlded|evicted|| number|dwnlded|
        ---------------------------------------------------------------------
        |      default     |   9   |   0   |   0   |   0   ||   9   |   0   |
        ---------------------------------------------------------------------

:: problems summary ::
:::: WARNINGS
                [NOT FOUND  ] org.slf4j#slf4j-api;1.7.30!slf4j-api.jar (3ms)

        ==== local-m2-cache: tried

          file:/Users/username/.m2/repository/org/slf4j/slf4j-api/1.7.30/slf4j-api-1.7.30.jar

                ::::::::::::::::::::::::::::::::::::::::::::::

                ::              FAILED DOWNLOADS            ::

                :: ^ see resolution messages for details  ^ ::

                ::::::::::::::::::::::::::::::::::::::::::::::

                :: org.slf4j#slf4j-api;1.7.30!slf4j-api.jar

                ::::::::::::::::::::::::::::::::::::::::::::::



:: USE VERBOSE OR DEBUG MESSAGE LEVEL FOR MORE DETAILS
Exception in thread "main" java.lang.RuntimeException: [download failed: org.slf4j#slf4j-api;1.7.30!slf4j-api.jar]
        at org.apache.spark.deploy.SparkSubmitUtils$.resolveMavenCoordinates(SparkSubmit.scala:1389)
        at org.apache.spark.deploy.DependencyUtils$.resolveMavenDependencies(DependencyUtils.scala:54)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:308)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:871)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1007)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1016)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

I have also tried downloading the jar file from https://mvnrepository.com/artifact/org.apache.spark/spark-sql-kafka-0-10_2.12/3.0.1 and put it in /sparkdir/jars but without any success.

Am I doing something wrong to start the spark-shell? Which is the right way to configure this library?


Solution

  • clearing caches like ".ivy2/cache" "ivy2/jars" and ".m2/repository/" could fix your issue.