I'm using a Flume 1.7 Kafka source to pull data out of Apache Kafka into my AbstractSink. In the past I could re-start the offsets at the beginning of the topic by deleting the topic offsets using ./kafka-consumer-groups.sh --delete, but since Flume 1.7 (apparently) uses a "new" consumer, attempting ./kafka-consumer-groups.sh --delete now gives the following error message:
Option [delete] is not valid with [new-consumer]. Note that there's no need to delete group metadata for the new consumer as it is automatically deleted when the last member leaves
So what is the recommended way to achieve the desired behavior (re-processing the data from the beginning of the topic)?
Here is part of my Flume config:
myagent.sources.my-kafka-source.type = org.apache.flume.source.kafka.KafkaSource
myagent.sources.my-kafka-source.kafka.bootstrap.servers = kafka.example.net:9092
myagent.sources.my-kafka-source.kafka.consumer.group.id = my-gid
myagent.sources.my-kafka-source.kafka.topics = my.topic
myagent.sources.my-kafka-source.kafka.auto.offset.reset = earliest
myagent.sources.my-kafka-source.channels = my_channel
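Flume does not offer direct support for rewinding, although Kafka itself ships with KafkaConsumer#seek, which lets a consumer re-consume messages. It seems you have to switch to a new group id instead, which means restarting the Flume agent. A group id with no committed offsets starts from wherever auto.offset.reset points, so with earliest it reads the topic from the beginning.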
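As a rough sketch of the group id change, the source section of the config might look like the fragment below. The value my-gid-v2 is only a placeholder for any group id that has never committed offsets. The fragment also assumes that Flume 1.7 forwards consumer settings only when they carry the kafka.consumer. prefix, as its user guide describes; if that holds, kafka.auto.offset.reset without the prefix may not reach the consumer at all, so it is worth checking against your Flume version.

```properties
# Placeholder group id (an assumption for illustration): any id with no committed offsets will do.
myagent.sources.my-kafka-source.kafka.consumer.group.id = my-gid-v2
# Assumes the kafka.consumer. prefix is how Flume 1.7 passes properties to the Kafka consumer.
# With no committed offsets for the group, "earliest" starts at the beginning of the topic.
myagent.sources.my-kafka-source.kafka.consumer.auto.offset.reset = earliest
```

After changing the group id, restart the agent so the source joins the new group. Events already delivered under the old group id will be delivered again, so the sink should tolerate duplicates.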