python, azure, azure-eventhub

What is a partition in Event Hub?


I have the following code to read messages from Event Hub, but it seems that I do not get all messages. The partition count is 4, but I am using only 1 reader.

Is partition "0" right, or do I have to add 4 receivers, one for each partition?

#!/usr/bin/python3
import os
import sys
import logging
import time
from azure.eventhub import EventHubClient, Receiver, Offset

logger = logging.getLogger("azure")

ADDRESS = "amqps://xxxxx.servicebus.windows.net/insights-operational-logs"

CONSUMER_GROUP = "myConsumerGrp"
OFFSET = Offset("-1")
PARTITION = "0"

total = 0
last_sn = -1
last_offset = "-1"

client = EventHubClient(ADDRESS, debug=False, username="xxxx", password="xxxx")
try:
    receiver = client.add_receiver(CONSUMER_GROUP, PARTITION, prefetch=5000, offset=OFFSET)
    client.run()
    start_time = time.time()
    for event_data in receiver.receive(timeout=100):
        last_offset = event_data.offset
        last_sn = event_data.sequence_number
        print("Received: offset {}, sn {}, (BODY LEN: {})".format(last_offset.value, last_sn, len(event_data.body_as_str())))
        print("    Body: {}".format(event_data.body_as_str()))
        print("")
        total += 1

    end_time = time.time()
    # The client is stopped once, in the finally block below.
    run_time = end_time - start_time
    print("Received {} messages in {} seconds".format(total, run_time))

except KeyboardInterrupt:
    pass
finally:
    client.stop()


Solution

  • Partitioning is how Event Hub distributes load. If you provisioned the Event Hub with 4 partitions, then by default the messages sent to it are distributed round-robin across those partitions, so each partition holds only a share of the messages.

    Reading only partition 0 therefore returns only part of the stream. To process all messages you need 4 readers, one per partition, or one (or more) readers that regularly switch to reading from the other partitions.

    More information is found in the docs
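
As a rough illustration of the one-reader-per-partition approach, the sketch below registers a receiver for each partition using the same add_receiver and receive calls as the code in the question. The helper names add_receivers and drain_once are hypothetical, the partition IDs "0" to "3" are assumed from the stated partition count of 4, and the sketch assumes every receiver is added before client.run() is called, mirroring the order used in the question.

```python
def add_receivers(client, consumer_group, partition_ids, offset, prefetch=5000):
    """Register one receiver per partition (hypothetical helper).

    Assumes `client` exposes add_receiver(consumer_group, partition,
    prefetch=..., offset=...) as used in the question's code.
    """
    return {
        pid: client.add_receiver(consumer_group, pid, prefetch=prefetch, offset=offset)
        for pid in partition_ids
    }


def drain_once(receivers, timeout=5):
    """Poll each partition's receiver once and collect (partition, event) pairs."""
    events = []
    for pid, receiver in receivers.items():
        for event_data in receiver.receive(timeout=timeout):
            events.append((pid, event_data))
    return events


# Possible usage with the names from the question (not executed here):
#
#   client = EventHubClient(ADDRESS, debug=False, username="xxxx", password="xxxx")
#   receivers = add_receivers(client, CONSUMER_GROUP, ["0", "1", "2", "3"], OFFSET)
#   client.run()
#   try:
#       for pid, event_data in drain_once(receivers):
#           print("Partition {}: sn {}".format(pid, event_data.sequence_number))
#   finally:
#       client.stop()
```

Polling the partitions one after another keeps the sketch short. A real consumer would more likely call drain_once in a loop, or give each receiver its own thread or process.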