Search code examples
apache-kafkakafka-consumer-apidistributed-computingkafka-python

How to make a consumer leave and enter a consumer group in kafka


So, I have a consumer group and I have a few distinct nodes, each acting as a consumer. Each node is supposed to perform some computation intensive task. I want to make a consumer join to this consumer group only when it has available CPU resources. Once it has joined, it will consume a message from the topic regarding what computation it needs to perform and then start the computation. Now as this consumer is engaged in a computational task, I want to make it exit from the consumer group as it doesn't have any further capability to perform new computations. Is this possible to do in kafka? Or maybe there is another better way to do the above thing? I am using the kafka-python library.


Solution

  • In general, regardless of the Kafka client, this is possible to do using any Kafka consumer. The way to do it is simply subscribe to the topic, consume the message you want to process, acknowledge only that specific message, and close the consumer.

    Specifically in the Kafka python client, the method you want is KafkaConsumer.close. Make sure to set auto-commit to false, because your poll might have consumed more than the messages you want to compute, and you only want to acknowledge the one you're actually going to work on.

    Alternatively, you can set your consumer properties (specifically max.poll.records) to fetch only 1 message per poll, and then you can use the .close method with auto-commit set to true.

    More info on all the KafkaConsumer configuration options here: https://kafka.apache.org/documentation/#consumerconfigs

    And here's a link to the official kafka-python client KafkaConsumer docs:

    https://kafka-python.readthedocs.io/en/master/apidoc/KafkaConsumer.html#kafka.KafkaConsumer.close