c#, azure, azure-eventhub, event-driven-design

Azure Event Hub - How to consume events in parallel using the official SDK?


I've set up the following test:

  • Created an Azure Event Hub with 10 partitions
  • Created a single storage account
  • Created a single consumer group
  • Filled the hub with 10k messages
  • Created 2 containers (on AKS) that consume those events (using the same consumer group) and log them to Azure Application Insights.

Expectation:
Run

traces
| where message == "Event received"
| summarize count() by bin(timestamp,1s), cloud_RoleInstance
| render timechart 

and see something like:
[image: expected timechart]

but instead I'm seeing this:
[image: observed timechart]

(This is 3 runs of 10k events each, to eliminate the "pod not warmed up" variable.)

Note that there's little or no overlap between the pods' activity, as if one of them holds a lock that is mysteriously released at some point and then taken by the other pod.

Relevant Consumer code:

protected override async Task ExecuteAsync(CancellationToken stoppingToken)
{
    _processor = new EventProcessorClient(_storageClient, _consumerGroup, _hubConnection, _eventHubName);
    _processor.ProcessEventAsync += ProcessEventHandler;
    _processor.ProcessErrorAsync += ProcessErrorHandler;

    // Start processing; events are then delivered to ProcessEventHandler in the background.
    await _processor.StartProcessingAsync(stoppingToken);
}

internal async Task ProcessEventHandler(ProcessEventArgs eventArgs)
{
    _logger.LogTelemetry("Event received");

    await eventArgs.UpdateCheckpointAsync(eventArgs.CancellationToken);
}

Solution

  • There is actually nothing wrong with the code above. After some discussion on this GitHub issue, we were able to observe the expected behavior when dealing with larger batches (500k events).

    Here's a screenshot:
    [image: screenshot]
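
As a side note on throughput rather than balancing: the handler in the question calls UpdateCheckpointAsync after every event, and each checkpoint is a write to the storage account. A common pattern is to checkpoint only every N events per partition. Below is a minimal sketch of that idea in plain C#. CheckpointThrottle and ShouldCheckpoint are hypothetical names, not part of the Azure SDK, and the interval is an assumption you would need to tune.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of the Azure SDK): decides when a partition
// is due for a checkpoint, so the handler does not write one per event.
public sealed class CheckpointThrottle
{
    private readonly int _interval;
    private readonly Dictionary<string, int> _pending = new Dictionary<string, int>();
    private readonly object _gate = new object();

    public CheckpointThrottle(int interval)
    {
        if (interval < 1)
            throw new ArgumentOutOfRangeException(nameof(interval));
        _interval = interval;
    }

    // Records one processed event for the given partition. Returns true on
    // every interval-th event of that partition and resets its counter.
    public bool ShouldCheckpoint(string partitionId)
    {
        // The lock guards the shared dictionary, since handlers for
        // different partitions may run concurrently.
        lock (_gate)
        {
            _pending.TryGetValue(partitionId, out int count); // 0 if unseen
            count++;
            if (count >= _interval)
            {
                _pending[partitionId] = 0;
                return true;
            }
            _pending[partitionId] = count;
            return false;
        }
    }
}
```

In ProcessEventHandler this might be used as `if (_throttle.ShouldCheckpoint(eventArgs.Partition.PartitionId)) await eventArgs.UpdateCheckpointAsync(eventArgs.CancellationToken);`, where _throttle is a hypothetical field holding the helper. The trade-off is that after a restart, events processed since the last checkpoint on a partition are delivered again, so the handler has to tolerate duplicates.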