Tags: azure, apache-spark, pyspark, azure-eventhub, databricks

Consume events from Event Hub in Azure Databricks using PySpark


I can find Spark connectors and guidelines for consuming events from Event Hub using Scala in Azure Databricks.

But how can we consume events from Event Hub in Azure Databricks using PySpark?

Any suggestions or documentation pointers would help. Thanks.


Solution

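  • Below is a snippet for reading events from Event Hub with PySpark on Azure Databricks. It assumes the Azure Event Hubs Spark connector library is attached to the cluster, since that library provides the "eventhubs" format.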

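    # Connection string with an entity path (placeholder values; substitute your own)
    connectionString = "Endpoint=sb://SAMPLE;SharedAccessKeyName=KEY_NAME;SharedAccessKey=KEY;EntityPath=EVENTHUB_NAME"
    
    # Source with default settings
    ehConf = {
      'eventhubs.connectionString' : connectionString
    }
    
    # Connector versions 2.3.15 and later expect the connection string to be encrypted.
    # If you are on such a version, uncomment the following line:
    # ehConf['eventhubs.connectionString'] = sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connectionString)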
    
    df = spark \
      .readStream \
      .format("eventhubs") \
      .options(**ehConf) \
      .load()
    
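    # The event payload arrives in the binary "body" column; cast it to a string to read it
    readInStreamBody = df.withColumn("body", df["body"].cast("string"))
    
    # display() is a Databricks notebook function that renders the streaming DataFrame
    display(readInStreamBody)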
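
If you would rather not hand-type the connection string, it can be assembled from its parts. The following is only a sketch in plain Python: the helper names `build_connection_string` and `build_eh_conf` are hypothetical (they are not part of the connector), and the values are the same placeholders used above, so substitute your own endpoint, policy name, key, and event hub name.

```python
def build_connection_string(endpoint_host, key_name, key, event_hub_name):
    """Assemble an Event Hubs connection string that includes EntityPath.

    Hypothetical helper for illustration; all arguments are supplied by the caller.
    """
    parts = [
        f"Endpoint=sb://{endpoint_host}",
        f"SharedAccessKeyName={key_name}",
        f"SharedAccessKey={key}",
        f"EntityPath={event_hub_name}",
    ]
    # The fields of a connection string are separated by semicolons
    return ";".join(parts)


def build_eh_conf(connection_string):
    """Return the options dict that is passed to .options(**ehConf)."""
    return {"eventhubs.connectionString": connection_string}


# Placeholder values only; these mirror the sample string in the snippet above
connectionString = build_connection_string("SAMPLE", "KEY_NAME", "KEY", "EVENTHUB_NAME")
ehConf = build_eh_conf(connectionString)
```

The resulting `ehConf` can be passed to `spark.readStream.format("eventhubs").options(**ehConf).load()` in the same way as the hand-written dictionary. Keeping the key out of the notebook source (for example, by reading it from a secret store) is generally a safer choice than hard-coding it.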