apache-kafka apache-storm

What is the difference between a "Kafka Spout" and a "Kafka Consumer"?


Both the "Kafka Spout" and the "Kafka Consumer" retrieve data from the Kafka brokers. As far as I know, the spout is for communicating with Storm, and the consumer is for everything else.

- But what is the difference technically?

- What would be the difference between pulling the data out with a Consumer and then receiving it in a "Storm Spout", versus just using a "Kafka Spout" and adding it to my Storm TopologyBuilder via setSpout()?

- When should I use a Consumer, and when a Kafka Spout?


Solution

  • The "Kafka Spout" is a Storm-specific adapter that reads data from Kafka into a Storm topology. Behind the scenes, the Kafka spout uses Kafka's built-in "Kafka consumer" client.

    Technically, the difference is that the Kafka spout is a Storm-aware "wrapper" on top of Kafka's consumer client.

    In Storm, you should almost always use the included Kafka spout (see https://github.com/apache/storm/tree/master/external/storm-kafka or, for a spout implementation that uses Kafka's so-called "new" consumer client, https://github.com/apache/storm/tree/master/external/storm-kafka-client). Implementing your own would be a very rare case. Perhaps the most likely reason would be a bug in the existing Kafka spout that you need to work around until the Storm project fixes it upstream.
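To make the "wrapper" idea concrete, here is a minimal sketch of the adapter relationship using only the Java standard library. Every type in it (RecordConsumer, TupleCollector, ConsumerBackedSpout) is a hypothetical stand-in invented for illustration, not part of the Storm or Kafka APIs. It only assumes that a consumer hands out records in batches while a spout is asked for one tuple at a time.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;

public class SpoutWrapperSketch {

    /** Hypothetical stand-in for a Kafka consumer client: returns records in batches. */
    interface RecordConsumer {
        List<String> poll();
    }

    /** Hypothetical stand-in for the collector a streaming framework hands to a spout. */
    interface TupleCollector {
        void emit(String value);
    }

    /** The "spout": owns a consumer and adapts batch polling into per-tuple emission. */
    static class ConsumerBackedSpout {
        private final RecordConsumer consumer;
        private final Queue<String> buffer = new ArrayDeque<>();

        ConsumerBackedSpout(RecordConsumer consumer) {
            this.consumer = consumer;
        }

        /** Called repeatedly by the framework; emits at most one tuple per call. */
        void nextTuple(TupleCollector collector) {
            if (buffer.isEmpty()) {
                buffer.addAll(consumer.poll()); // refill from the wrapped consumer
            }
            String next = buffer.poll();
            if (next != null) {
                collector.emit(next);
            }
        }
    }

    /** Drives the spout a fixed number of times and returns everything it emitted. */
    static List<String> run(List<List<String>> batches, int calls) {
        Queue<List<String>> pending = new ArrayDeque<>(batches);
        RecordConsumer consumer = () -> {
            List<String> batch = pending.poll();
            return batch == null ? new ArrayList<>() : batch; // no more data
        };
        ConsumerBackedSpout spout = new ConsumerBackedSpout(consumer);
        List<String> emitted = new ArrayList<>();
        for (int i = 0; i < calls; i++) {
            spout.nextTuple(emitted::add);
        }
        return emitted;
    }

    public static void main(String[] args) {
        List<List<String>> batches =
                Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"));
        System.out.println(run(batches, 4));
    }
}
```

The topology code never touches the consumer directly; it only sees the spout. The included Kafka spout also handles concerns this sketch leaves out, such as acking tuples and committing offsets, which is a large part of why using it is usually preferable to writing your own consumer-in-a-spout.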