I am trying to read data from a secured Kafka cluster using Spark Structured Streaming. I am using the library "spark-sql-kafka-0-10_2.12":"3.0.0-preview" because it lets me specify a custom group id (instead of Spark generating its own group id).
Dependency used in code:
<groupId>org.apache.spark</groupId>
<artifactId>spark-sql-kafka-0-10_2.12</artifactId>
<version>3.0.0-preview</version>
I am getting the error below, even after specifying the required JAAS configuration in the Spark options.
Caused by: java.lang.IllegalArgumentException: requirement failed: Delegation token must exist for this connector.
    at scala.Predef$.require(Predef.scala:281)
    at org.apache.spark.kafka010.KafkaTokenUtil$.isConnectorUsingCurrentToken(KafkaTokenUtil.scala:299)
    at org.apache.spark.sql.kafka010.KafkaDataConsumer.getOrRetrieveConsumer(KafkaDataConsumer.scala:533)
    at org.apache.spark.sql.kafka010.KafkaDataConsumer.$anonfun$get$1(KafkaDataConsumer.scala:275)
The following document says that obtaining a delegation token can be disabled: https://spark.apache.org/docs/3.0.0-preview/structured-streaming-kafka-integration.html
I tried setting the property spark.security.credentials.kafka.enabled to false in the Spark config, but it still fails with the same error.
This appears to be a bug in the preview release that has been fixed in the GA Spark 3.x release.
Reference : https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-30495
Now we can specify a custom consumer group name when fetching data from Kafka (even though it is not recommended, and a warning message is logged when it is specified).
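For anyone landing here, this is a rough PySpark sketch of what the reader setup might look like on a GA 3.x release. It is not the exact code from my job. The broker address, topic, group name, SASL_SSL/PLAIN settings and the JAAS string are all placeholder assumptions; use whatever your cluster actually requires. The helper names build_kafka_options and read_kafka_stream are made up for illustration.

```python
def build_kafka_options(bootstrap_servers, topic, group_id, jaas_config):
    """Collect the Kafka source options in one dict (hypothetical helper).

    Options prefixed with "kafka." are passed through to the Kafka consumer.
    """
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        # Custom consumer group id; Spark logs a warning when this is set.
        "kafka.group.id": group_id,
        # Assumed security settings, adjust to match your cluster.
        "kafka.security.protocol": "SASL_SSL",
        "kafka.sasl.mechanism": "PLAIN",
        "kafka.sasl.jaas.config": jaas_config,
    }


def read_kafka_stream(options):
    """Create a session and return a streaming DataFrame for the Kafka source."""
    # Imported here so the options helper can be used without pyspark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("secured-kafka-reader")  # placeholder app name
        # The property from the question: do not obtain a Kafka delegation token.
        .config("spark.security.credentials.kafka.enabled", "false")
        .getOrCreate()
    )
    return spark.readStream.format("kafka").options(**options).load()


# Placeholder values only, not real credentials.
example_jaas = (
    "org.apache.kafka.common.security.plain.PlainLoginModule required "
    'username="user" password="secret";'
)
example_options = build_kafka_options(
    "broker1:9093", "events", "my-custom-group", example_jaas
)
```

Calling read_kafka_stream(example_options) needs the spark-sql-kafka-0-10 package on the classpath and a reachable broker, so only the options dict is built at import time here.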