apache-kafka, spring-kafka, spring-cloud-stream, spring-cloud-stream-binder-kafka

Strategy for maximum throughput having 6 Kafka Consumers when the processing of each message requires a long time


Consider this scenario: a Kafka topic with 6 partitions, and a Spring Java Kafka consumer application with 6 replicas, so that each replica handles one of the partitions.

The problem I'm facing is that processing each message in the consumer takes a long time (~20 seconds), since it needs to call a really slow external system.

So even though I've provisioned 6 partitions/replicas I end up having a bottleneck in which the 6 consumers block ~20 seconds per message, which means a throughput of 6 messages every 20 seconds!!

Could you suggest ways to speed up this scenario taking into account that I can't modify the behaviour of the external system?


Solution

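  • Increase the number of partitions and the concurrency (consumer threads) on each instance.

    The number of partitions must be >= instances * concurrency. Kafka assigns each partition to at most one consumer in a group, so any consumer threads beyond the partition count sit idle.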
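To make the arithmetic concrete, here is a minimal plain-Java sketch of the sizing rule above. It assumes every consumer thread blocks for the full ~20 seconds per message and ignores all other overhead. The class name, the method names, and the concurrency value of 4 are hypothetical, chosen only for illustration.

```java
import java.time.Duration;

// Hypothetical helper for illustration only; not part of Spring or Kafka.
public class ThroughputPlan {

    // Upper bound on messages per second when each consumer thread
    // blocks for processingTime on every message.
    static double messagesPerSecond(int instances, int concurrency, Duration processingTime) {
        int threads = instances * concurrency;
        return threads / (processingTime.toMillis() / 1000.0);
    }

    // The rule from the answer: partitions >= instances * concurrency,
    // otherwise some consumer threads never receive a partition.
    static boolean hasEnoughPartitions(int partitions, int instances, int concurrency) {
        return partitions >= instances * concurrency;
    }

    public static void main(String[] args) {
        Duration perMessage = Duration.ofSeconds(20);

        // Current setup: 6 instances, 1 thread each, 6 partitions.
        System.out.println(messagesPerSecond(6, 1, perMessage));

        // Assumed target: 6 instances with 4 threads each needs at least 24 partitions.
        System.out.println(hasEnoughPartitions(6, 6, 4));
        System.out.println(hasEnoughPartitions(24, 6, 4));
        System.out.println(messagesPerSecond(6, 4, perMessage));
    }
}
```

With these assumed numbers, going from 6 to 24 partitions and from 1 to 4 threads per instance raises the ceiling from 6 to 24 messages every 20 seconds. In Spring for Apache Kafka the per-instance thread count can be set through the concurrency attribute of @KafkaListener or on the ConcurrentKafkaListenerContainerFactory; with Spring Cloud Stream it is the consumer concurrency property of the binding.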