Reading from Azure Event Hubs with the Kafka driver doesn't seem to get any data

I'm running the following code in an Azure Databricks Python notebook:

TOPIC = "myeventhub"
EH_SASL = " required username=\"$ConnectionString\" password=\"Endpoint=sb://;SharedAccessKeyName=MyKeyName;SharedAccessKey=myaccesskey;\";"

df = spark.readStream \
    .format("kafka") \
    .option("subscribe", TOPIC) \
    .option("kafka.bootstrap.servers", BOOTSTRAP_SERVERS) \
    .option("kafka.sasl.mechanism", "PLAIN") \
    .option("", "SASL_SSL") \
    .option("kafka.sasl.jaas.config", EH_SASL) \
    .option("", "60000") \
    .option("", "60000") \
    .option("failOnDataLoss", "false") \
    .option("startingOffsets", "earliest") \
    .load()

df_write = df.writeStream \
    .outputMode("append") \
    .format("console") \
    .start()

This shows no output in the notebook. How could I debug what the problem is?


  • If you use .format("console"), the output won't appear in the notebook; it goes to the driver & executor logs instead. This is a difference between Spark and Databricks.

    If you want to see the data in the notebook, pass the streaming DataFrame to the display function: display(df)
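
The Kafka source delivers the key and value columns as binary, so the displayed rows can be hard to read as-is. Below is a minimal sketch of one way to make them readable, assuming a Databricks notebook (where display is a built-in, not part of open-source PySpark) and the df from the question. The helper name preview is hypothetical.

```python
def preview(stream_df):
    """Return a view of a Kafka-sourced DataFrame with readable columns.

    Hypothetical helper: casts the binary key/value columns to strings
    and keeps offset and timestamp for orientation.
    """
    return stream_df.selectExpr(
        "CAST(key AS STRING) AS key",
        "CAST(value AS STRING) AS value",
        "offset",
        "timestamp",
    )

# In a Databricks notebook cell (assumption: display is available there):
# display(preview(df))
```

If nothing shows up even then, the stream itself is probably not receiving records, which points back at the connection options rather than the sink.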
