
Searching messages in a large Kafka topic


I am planning to write a blog post about searching for messages based on some criteria. I feel there is a lack of tooling and frameworks in this space, even though it is a routine activity for any Kafka operations or development team.

The first option I looked into is UI tools. Most UI-based Kafka tools cannot search large topics well, or at least the ones I have seen cannot.

Next come CLI tools like kcat or kafka-*-consumer. They scale to a certain extent, but they lack extensive search capabilities.

This led me to look into Kafka connectors with a filter SMT added, or maybe KSQL. Another option is to write a fully custom tool in one's favourite language.

Of course, we could also dump the messages into a bucket or a similar store and search on top of that.

I've read that Conduktor provides some capability to search using SQL, but I am not sure how good it is.

Question to the community: what do you use to search messages in Kafka? One of the tools I've mentioned above, or something better?


Solution

  • The short answer is that there's a lack of tooling because Kafka is not meant for "needle in a haystack" searches. There are no indexes, so you need to scan the whole topic (unless you can precompute the partition, in which case you only scan that partition), and every record must be deserialized before it can be compared.

    The issue with console tooling (kcat, console-consumer) is that it assumes the records can be deserialized as text; you cannot easily grep binary data. Similarly, ksql and connectors only offer certain deserializers. That leaves you writing your own tooling for any non-standard formats.

    I believe the Conduktor client has a JSONPath operator, yes. Again, that's limited by its deserialization options, and it still must consume the whole topic.

    If you must have quick lookups, dump the Kafka data into an actual indexed database. Elasticsearch/OpenSearch or object storage would be the most flexible options for searching on any field.
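
To make the scan-and-deserialize cost described above concrete, here is a minimal Python sketch of what any home-grown search tool ends up doing. It is an illustration under stated assumptions, not a real Kafka client: `Record`, `try_decode_json` and `scan` are hypothetical names, the records are an in-memory list standing in for whatever a consumer loop would yield, and the values are assumed to be UTF-8 JSON objects. With Avro, Protobuf or a custom binary format, only the decode step would change.

```python
import json
from typing import Callable, Iterable, Iterator, NamedTuple, Optional


class Record(NamedTuple):
    """Hypothetical stand-in for one consumed Kafka record."""
    partition: int
    offset: int
    value: bytes


def try_decode_json(raw: bytes) -> Optional[dict]:
    """Deserialize one record value; return None if it is not a JSON object."""
    try:
        doc = json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None  # binary or malformed payload: cannot be matched as JSON
    return doc if isinstance(doc, dict) else None


def scan(records: Iterable[Record],
         predicate: Callable[[dict], bool]) -> Iterator[Record]:
    """Linear scan: each record is deserialized before it can be compared."""
    for record in records:
        doc = try_decode_json(record.value)
        if doc is not None and predicate(doc):
            yield record


if __name__ == "__main__":
    # Assumed sample data; a real tool would read these from the topic.
    topic = [
        Record(0, 0, b'{"order_id": 1, "status": "NEW"}'),
        Record(0, 1, b"\x00\x01\xffbinary payload"),
        Record(1, 0, b'{"order_id": 2, "status": "FAILED"}'),
    ]
    for hit in scan(topic, lambda d: d.get("status") == "FAILED"):
        print(hit.partition, hit.offset)  # prints "1 0"
```

The work grows linearly with the number of records scanned, and the binary record is silently skipped because the assumed deserializer cannot read it. Both limits are the reason an indexed store is the better fit for repeated lookups.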