Tags: apache-spark, apache-kafka, spark-structured-streaming

How to disable 'spark.security.credentials.${service}.enabled' in Structured Streaming when connecting to a Kafka cluster


I am trying to read data from a secured Kafka cluster using Spark Structured Streaming. I am using the library "spark-sql-kafka-0-10_2.12":"3.0.0-preview" because it allows specifying a custom group id (instead of Spark generating its own group id).

Dependency used in code:

    <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql-kafka-0-10_2.12</artifactId>
        <version>3.0.0-preview</version>
    </dependency>

I get the error below even after specifying the required JAAS configuration in the Spark options.

    Caused by: java.lang.IllegalArgumentException: requirement failed: Delegation token must exist for this connector.
        at scala.Predef$.require(Predef.scala:281)
        at org.apache.spark.kafka010.KafkaTokenUtil$.isConnectorUsingCurrentToken(KafkaTokenUtil.scala:299)
        at org.apache.spark.sql.kafka010.KafkaDataConsumer.getOrRetrieveConsumer(KafkaDataConsumer.scala:533)
        at org.apache.spark.sql.kafka010.KafkaDataConsumer.$anonfun$get$1(KafkaDataConsumer.scala:275)

The following documentation states that obtaining a delegation token can be disabled: https://spark.apache.org/docs/3.0.0-preview/structured-streaming-kafka-integration.html

I tried setting the property spark.security.credentials.kafka.enabled to false in the Spark config, but it still fails with the same error.
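For reference, a setup of the kind described above might look roughly like the sketch below. It is only an illustration: the helper names, the SASL_SSL/PLAIN choices, and any broker, topic, or credential values passed in are assumptions, not details taken from the original job. The Spark-level property is set on the session, while the `kafka.`-prefixed entries are source options handed to the Kafka consumer.

```python
# Sketch only: helper names and the SASL settings are illustrative assumptions.

def spark_conf_without_kafka_tokens():
    """Spark-level setting named in the linked Kafka integration guide."""
    return {"spark.security.credentials.kafka.enabled": "false"}


def secured_kafka_options(bootstrap_servers, topic, jaas_config):
    """Source options for the Kafka reader; 'kafka.'-prefixed keys go to the consumer."""
    return {
        "kafka.bootstrap.servers": bootstrap_servers,
        "subscribe": topic,
        "kafka.security.protocol": "SASL_SSL",  # assumption: adjust to the cluster
        "kafka.sasl.mechanism": "PLAIN",        # assumption: adjust to the cluster
        "kafka.sasl.jaas.config": jaas_config,
    }


def open_stream(conf, options):
    """Create a session with the given conf and return a streaming DataFrame.

    Requires pyspark and the spark-sql-kafka connector to be available.
    """
    # Imported lazily so the two helpers above can be used without Spark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("secured-kafka-read")
    for key, value in conf.items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()
    return spark.readStream.format("kafka").options(**options).load()
```

The same property can also be passed at submit time with `--conf spark.security.credentials.kafka.enabled=false`.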


Solution

  • This appears to be a bug in the preview release that has been fixed in the GA Spark 3.x release.

    Reference : https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-30495

    With the fix, we can specify a custom consumer group name when fetching data from Kafka, although this is not recommended and a warning message appears when it is set.
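    As a rough sketch, the custom consumer group is supplied through the `kafka.group.id` source option listed in the Spark 3.0 Kafka integration guide. The helper name and the example values below are hypothetical.

```python
def with_custom_group_id(options, group_id):
    """Return a copy of the Kafka source options with an explicit consumer group."""
    updated = dict(options)  # copy so the caller's dict is left unchanged
    updated["kafka.group.id"] = group_id  # option name from the Spark 3.0 Kafka guide
    return updated


# Hypothetical values for illustration only.
base_options = {"kafka.bootstrap.servers": "broker:9093", "subscribe": "events"}
reader_options = with_custom_group_id(base_options, "my-consumer-group")

# With a SparkSession named `spark` available, the options would be applied as:
# df = spark.readStream.format("kafka").options(**reader_options).load()
```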