I have always read that Kafka assigns only one consumer instance (thread) to a given partition. I recently came across ConcurrentKafkaListenerContainerFactory, and while reading about it, some aspects seemed to contradict what I knew before.
Let's say we have one consumer instance (pod/machine), and within it I use ConcurrentKafkaListenerContainerFactory with the concurrency level set to 3. I have a Kafka topic with three partitions. With the concurrency level set to 3, I understand that my @KafkaListener method will read messages from all three partitions of the topic.
Does this mean that even with one consumer instance we are able to consume messages from three partitions in parallel?
Can this be used to increase the rate of consumption from a Kafka topic without increasing the number of partitions? The only other way I knew of was to create a thread pool and submit the received messages to it. Can we use ConcurrentKafkaListenerContainerFactory as a replacement for that approach? It definitely sounds much easier to implement.
The ConcurrentMessageListenerContainer (which ConcurrentKafkaListenerContainerFactory builds) creates as many KafkaConsumer instances as the concurrency you configure, each polling on its own thread. That’s how you can read different partitions in parallel.
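As a rough illustration, a minimal configuration sketch might look like the one below. The broker address `localhost:9092`, the group id `demo-group`, the topic `demo-topic`, and the class names `KafkaConcurrencyConfig` and `DemoListener` are placeholders invented for this example, not anything from your setup. It assumes spring-kafka is on the classpath and that the topic has three partitions.

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.common.serialization.StringDeserializer;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.kafka.annotation.EnableKafka;
import org.springframework.kafka.annotation.KafkaListener;
import org.springframework.kafka.config.ConcurrentKafkaListenerContainerFactory;
import org.springframework.kafka.core.ConsumerFactory;
import org.springframework.kafka.core.DefaultKafkaConsumerFactory;

@Configuration
@EnableKafka
public class KafkaConcurrencyConfig {

    @Bean
    public ConsumerFactory<String, String> consumerFactory() {
        Map<String, Object> props = new HashMap<>();
        // Placeholder broker address and group id (assumptions for this sketch).
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
        return new DefaultKafkaConsumerFactory<>(props);
    }

    // The bean name "kafkaListenerContainerFactory" is the default that @KafkaListener looks up.
    @Bean
    public ConcurrentKafkaListenerContainerFactory<String, String> kafkaListenerContainerFactory() {
        ConcurrentKafkaListenerContainerFactory<String, String> factory =
                new ConcurrentKafkaListenerContainerFactory<>();
        factory.setConsumerFactory(consumerFactory());
        // Three KafkaConsumer instances in the same group, each on its own thread.
        factory.setConcurrency(3);
        return factory;
    }

    @Bean
    public DemoListener demoListener() {
        return new DemoListener();
    }
}

// Hypothetical listener class, registered as a bean above.
class DemoListener {

    @KafkaListener(topics = "demo-topic")
    public void listen(String message) {
        // The thread name differs per underlying consumer, which makes the parallelism visible.
        System.out.println(Thread.currentThread().getName() + " received: " + message);
    }
}
```

With three partitions and a concurrency of 3, each of the three consumers would normally end up with one partition. Setting the concurrency higher than the partition count leaves the extra consumers idle, so this does not replace adding partitions (or a worker pool) if you need more parallelism than the topic has partitions.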
Don’t confuse a KafkaConsumer object with the general notion of a consumer. In other words, your consumer application is definitely not a Kafka Consumer object by the Apache Kafka definition. This is not the first time I have heard people call their application a “Kafka Consumer”, though… What doc did you read that makes you think so?
See more info about concurrency in Spring for Apache Kafka in its docs: https://docs.spring.io/spring-kafka/docs/current/reference/html/#message-listener-container