Tags: multithreading, apache-kafka, reactive-programming, project-reactor, reactor-kafka

Using different threads to read from a consumer group in Kafka using reactor-kafka


I need to consume from a Kafka topic that will hold millions of records. After reading each record from the topic, I need to transform it and write it to another topic. I am able to consume messages from the topic, process the data on multiple threads, and write the results to another topic. I followed the example here: https://projectreactor.io/docs/kafka/1.3.5-SNAPSHOT/reference/index.html#concurrent-ordered

Here is my code:

public Flux<?> flux() {
    KafkaSender<Integer, Person> sender = sender(senderOptions());
    return KafkaReceiver.create(receiverOptions(Collections.singleton(sourceTopic)))
            .receive()
            // Transform each value and carry its offset along as correlation metadata
            .map(m -> SenderRecord.create(transform(m.value()), m.receiverOffset()))
            .as(sender::send)
            // Acknowledge the source offset once the transformed record has been sent
            .doOnNext(m -> m.correlationMetadata().acknowledge())
            .doOnCancel(() -> close());
}

I have multiple consumers and, because of the volume of data, was looking into adding several reader threads to consume from the topic. However, the reactor-kafka documentation mentions that KafkaReceiver is not thread-safe, since the underlying KafkaConsumer cannot be accessed concurrently by multiple threads.

I am looking for suggestions on reading from a topic concurrently.


Solution

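  • What you are looking for is called a consumer group. The maximum parallelism you can get when consuming is limited by the number of partitions your topic has.

    Kafka's consumer group mechanism lets you split the work of consuming a topic among different "readers" that belong to the same group. Each partition is assigned to exactly one consumer in the group, so each consumer is solely responsible for one or more partitions, depending on the number of consumers in the group and the number of partitions in the topic. Consumers beyond the partition count receive no partitions and sit idle. With reactor-kafka, this means creating a separate KafkaReceiver per reader, all configured with the same group.id, rather than sharing one receiver across threads.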
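
To make the partition arithmetic above concrete, here is a minimal sketch in plain Java (no Kafka dependency). It only imitates a contiguous-range split of partitions across the members of one group; the class name PartitionSplit, the method assign, and the consumer ids are hypothetical, and the real assignment is decided by Kafka according to the configured assignment strategy, so treat this as an illustration of the counting rather than of Kafka's implementation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: splits partition numbers 0..partitionCount-1 into
// contiguous ranges, one range per consumer in the group.
// This is NOT Kafka's assignor code, just the same kind of arithmetic.
final class PartitionSplit {

    // Returns a map of consumer id -> partitions that consumer would own.
    static Map<String, List<Integer>> assign(List<String> consumers, int partitionCount) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("need at least one consumer");
        }
        Map<String, List<Integer>> result = new LinkedHashMap<>();
        int base = partitionCount / consumers.size();   // every consumer gets at least this many
        int extra = partitionCount % consumers.size();  // the first 'extra' consumers get one more
        int next = 0;
        for (int i = 0; i < consumers.size(); i++) {
            int share = base + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int p = 0; p < share; p++) {
                owned.add(next++);
            }
            result.put(consumers.get(i), owned);
        }
        return result;
    }

    public static void main(String[] args) {
        // 8 partitions, 3 consumers: every consumer has work.
        System.out.println(assign(List.of("c0", "c1", "c2"), 8));
        // 2 partitions, 4 consumers: two consumers end up with nothing.
        System.out.println(assign(List.of("c0", "c1", "c2", "c3"), 2));
    }
}
```

The second call shows why adding readers beyond the partition count does not help: with 2 partitions and 4 consumers, two of the consumers own an empty list. If more parallelism is needed than the topic has partitions, the options are to add partitions or to fan out the processing after the records have been received.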