Tags: java, apache-kafka, stream, apache-storm

KafkaSpout reads messages twice in a Storm topology


I'm trying to simulate streaming traffic from Kafka into Storm. I use a KafkaSpout to read messages from a topic that a producer fills by reading tweets and sending them to that topic. My problem is that after the topology has consumed all the tweets sent to the topic, it keeps reading the same messages from the topic a second time. How can I stop the KafkaSpout from reading messages twice? (The replication factor is set to 1.)


Solution

  • The configuration looks fine to me.

    Maybe the issue is double acking. Make sure you're acking each tuple exactly once in execute (see the first sketch after this answer).

    As mentioned in a comment, please consider upgrading to a newer Kafka version, as well as switching to storm-kafka-client (a wiring sketch follows below).

    Also something that may make your life a little easier: consider extending BaseBasicBolt instead of BaseRichBolt. BaseBasicBolt automatically acks the tuple for you if execute returns without throwing; if you want to fail a tuple, throw FailedException. BaseRichBolt should only be used if you need more complicated acking, e.g. aggregating tuples from many execute invocations in memory before acking (see the last sketch below).
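
    As a reference for the acking point above, here is a minimal BaseRichBolt sketch that acks each tuple exactly once. The bolt name, the processing step, the tuple field position, and the Storm 2.x method signatures are assumptions for illustration, not code from the question.

    ```java
    import java.util.Map;

    import org.apache.storm.task.OutputCollector;
    import org.apache.storm.task.TopologyContext;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseRichBolt;
    import org.apache.storm.tuple.Tuple;

    // Hypothetical bolt: the point is that every code path acks or fails the tuple exactly once.
    public class TweetPrinterBolt extends BaseRichBolt {

        private OutputCollector collector;

        @Override
        public void prepare(Map<String, Object> topoConf, TopologyContext context, OutputCollector collector) {
            this.collector = collector;
        }

        @Override
        public void execute(Tuple tuple) {
            try {
                // Field layout depends on the spout's scheme; position 0 is used here for illustration.
                String tweet = tuple.getString(0);
                System.out.println(tweet);
                collector.ack(tuple);   // ack once, on the success path only
            } catch (Exception e) {
                collector.fail(tuple);  // fail (don't ack) on error so the spout can replay the tuple
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // this sketch emits nothing downstream
        }
    }
    ```

    If the same tuple is acked twice, the acker's XOR bookkeeping no longer balances, the tuple tree can time out as failed, and the old storm-kafka spout then replays those offsets, which looks exactly like the topic being read twice.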
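
    If you do move to storm-kafka-client, the spout wiring looks roughly like the sketch below. The broker address, topic name, consumer group id, and component names are placeholders, and the exact builder methods can differ slightly between storm-kafka-client versions.

    ```java
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.storm.kafka.spout.KafkaSpout;
    import org.apache.storm.kafka.spout.KafkaSpoutConfig;
    import org.apache.storm.topology.TopologyBuilder;

    public class TweetTopology {
        public static void main(String[] args) {
            // Placeholder broker, topic, and group id; adjust to your setup.
            KafkaSpoutConfig<String, String> spoutConfig =
                    KafkaSpoutConfig.builder("localhost:9092", "tweets")
                            .setProp(ConsumerConfig.GROUP_ID_CONFIG, "tweet-topology")
                            .build();

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("kafka-spout", new KafkaSpout<>(spoutConfig), 1);
            builder.setBolt("tweet-printer", new TweetPrinterBolt(), 1)
                   .shuffleGrouping("kafka-spout");
            // submit with LocalCluster or StormSubmitter as usual
        }
    }
    ```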
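
    And here is a minimal sketch of the BaseBasicBolt variant, where the framework acks for you and throwing FailedException marks a failure. Again, the bolt name and the processing step are made up for illustration.

    ```java
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.FailedException;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Tuple;

    // Hypothetical bolt: no explicit ack call; BaseBasicBolt acks when execute returns normally.
    public class TweetPrinterBasicBolt extends BaseBasicBolt {

        @Override
        public void execute(Tuple tuple, BasicOutputCollector collector) {
            try {
                System.out.println(tuple.getString(0));
                // returning normally here acks the tuple automatically
            } catch (Exception e) {
                // throwing FailedException tells Storm to fail (and later replay) the tuple
                throw new FailedException(e);
            }
        }

        @Override
        public void declareOutputFields(OutputFieldsDeclarer declarer) {
            // nothing emitted downstream in this sketch
        }
    }
    ```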