When I looked for a way to count the messages in a topic, this command seemed to work well:
kafka-run-class kafka.tools.GetOffsetShell --broker-list broker1:9092,broker2:9092,broker3:9092 --topic rev-dly-upd --time -1
The only problem is that after I change the retention.ms config to retention.ms=1000 and confirm that the topic has been configured by running
kafka-topics --describe --zookeeper zookeeper1:2181 --topic rev-dly-upd
I can clearly see that the config is set to 1000:
Topic:rev-dly-upd PartitionCount:8 ReplicationFactor:3 Configs:retention.ms=1000
Topic: rev-dly-upd Partition: 0 Leader: 159 Replicas: 159,96,160 Isr: 159,96,160
Topic: rev-dly-upd Partition: 1 Leader: 160 Replicas: 160,159,94 Isr: 94,160,159
Topic: rev-dly-upd Partition: 2 Leader: 94 Replicas: 94,160,95 Isr: 95,94,160
Topic: rev-dly-upd Partition: 3 Leader: 95 Replicas: 95,94,96 Isr: 95,96,94
Topic: rev-dly-upd Partition: 4 Leader: 96 Replicas: 96,95,159 Isr: 95,96,159
Topic: rev-dly-upd Partition: 5 Leader: 159 Replicas: 159,160,94 Isr: 159,94,160
Topic: rev-dly-upd Partition: 6 Leader: 160 Replicas: 160,94,95 Isr: 94,160,95
Topic: rev-dly-upd Partition: 7 Leader: 94 Replicas: 94,95,96 Isr: 95,96,94
Yet when I run
kafka-run-class kafka.tools.GetOffsetShell --broker-list broker1:9092,broker2:9092,broker3:9092 --topic rev-dly-upd --time -1
I still get records returned every time. What could the reasons be?
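One possible reading of this behaviour, offered as a sketch rather than a confirmed diagnosis: with --time -1 the tool prints the latest offset of each partition as topic:partition:offset lines, and that number is not a count of retained messages. Comparing it against the earliest offsets (--time -2) gives a rough estimate of what is still in the log. The helper names sum_offsets and retained_count below are made up for illustration, and the result can overcount on compacted topics or where transaction markers occupy offsets.

```shell
#!/bin/sh
# Sum the offset column of GetOffsetShell-style output (topic:partition:offset).
sum_offsets() {
  awk -F: 'NF >= 3 { total += $3 } END { printf "%.0f\n", total + 0 }'
}

# Approximate number of retained messages: latest offsets minus earliest offsets.
# $1 = output captured with --time -1 (latest)
# $2 = output captured with --time -2 (earliest)
retained_count() {
  latest=$(printf '%s\n' "$1" | sum_offsets)
  earliest=$(printf '%s\n' "$2" | sum_offsets)
  echo $((latest - earliest))
}

# Example against a live cluster (commented out; needs the Kafka tools on PATH):
# latest=$(kafka-run-class kafka.tools.GetOffsetShell --broker-list broker1:9092 --topic rev-dly-upd --time -1)
# earliest=$(kafka-run-class kafka.tools.GetOffsetShell --broker-list broker1:9092 --topic rev-dly-upd --time -2)
# retained_count "$latest" "$earliest"
```

If retention has already removed everything, the two sums should match and the estimate drops to zero even though --time -1 alone keeps printing large offsets.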
In the end I had to stop using kafka-run-class kafka.tools.GetOffsetShell to count the messages in a topic. If you google "how to count messages in kafka topic", a lot of posts will lead you to think that the above command, given the right arguments, will give you the total number of messages. However, with --time -1 it reports the latest offset of each partition, and that offset does not go down when old messages are deleted, so if you have purged messages during the lifespan of the topic it will not give you an accurate count. Instead, you have to do something like open a console consumer, redirect its output to a text file, and then count the lines of that file with old-fashioned wc -l.
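The console consumer approach might look roughly like the sketch below. It assumes a Kafka version whose kafka-console-consumer accepts --bootstrap-server (older releases used --zookeeper instead), and the file name messages.txt, the 10 second timeout, and the helper name count_records are arbitrary choices for illustration. Note that a message whose value contains a newline would be counted more than once.

```shell
#!/bin/sh
# Count the newline-terminated records in a dump written by the console consumer.
count_records() {
  # Reading from stdin makes wc print only the number, without the file name;
  # tr strips the padding some wc implementations add.
  wc -l < "$1" | tr -d ' '
}

# Example against a live cluster (commented out; needs the Kafka tools on PATH).
# --timeout-ms makes the consumer exit once no message has arrived for that long.
# kafka-console-consumer --bootstrap-server broker1:9092 --topic rev-dly-upd \
#   --from-beginning --timeout-ms 10000 > messages.txt
# count_records messages.txt
```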