I am trying to understand Kafka's notion of consumer groups.
All the documentation makes it look as if a consumer group is used to parallelize reads within a particular topic. But I've also read that consumers can subscribe to multiple topics.
So my question is -- what is the relationship between consumer groups and topics? Can two different consumers that belong to the same consumer group read from different topics?
> All the documentation makes it look as if a consumer group is used to parallelize reads within a particular topic. But I've also read that consumers can subscribe to multiple topics.
Yes and yes.
Kafka delivers every message published to a topic to every consumer group subscribed to that topic, but to only one consumer within each group. If you want queue-like behavior, where every message is received and processed by exactly one consumer, put all of the consumers in a single consumer group. If you want pub/sub behavior, where every consumer gets its own copy of every message, give every consumer a distinct consumer group.
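To make the two layouts concrete, here is a toy in-memory model of the delivery rule described above. It is not Kafka client code: the `deliver` function, the group and consumer names, and the round-robin choice (standing in for Kafka's partition assignment) are all made up for illustration.

```python
from collections import defaultdict


def deliver(messages, groups):
    """Toy model: every group gets each message; one consumer per group handles it.

    `groups` maps a group ID to the list of consumers in that group.
    """
    received = defaultdict(list)
    for i, msg in enumerate(messages):
        for group_id, consumers in groups.items():
            # Round-robin is only a stand-in for partition assignment (assumption).
            chosen = consumers[i % len(consumers)]
            received[(group_id, chosen)].append(msg)
    return dict(received)


messages = ["m0", "m1", "m2", "m3"]

# Queue-like: both consumers share one group, so each message is handled once.
queue = deliver(messages, {"workers": ["c1", "c2"]})

# Pub/sub: each consumer has its own group, so each consumer sees every message.
pubsub = deliver(messages, {"g1": ["c1"], "g2": ["c2"]})

print(queue)
print(pubsub)
```

In the queue layout the four messages are split between `c1` and `c2`; in the pub/sub layout both consumers receive all four. A hybrid is just several groups, some of which have more than one member.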
> So my question is -- what is the relationship between consumer groups and topics? can 2 different consumers that belong to the same consumer group read from different topics?
It's best to ask: do you want queue behavior (1 msg -> one and only one consumer)? Or pub/sub behavior (1 msg -> every consumer)? Or some hybrid of the two? Once you have chosen that, lay out your consumer groups accordingly.
To directly answer the question: you can generally organize things any way you want. The most logical way to think of it is that each topic has N consumer groups reading from it. You can use the same consumer group ID on different topics, but I don't think that holds much significance, since Kafka tracks a group's offsets separately for each topic and partition.