I have a topic keyed by byte[]. I want to repartition it and process the topic by another key, taken from a field in the message body.
I found KGroupedStream and the groupBy function, but it requires an aggregation function to convert back to a KTable/KStream. I don't need an aggregate. I just want to repartition and process the output.
(Kafka Streams 2.5.x or older)
Not sure if this is entirely kosher, but it works, and the repartition topic is created automatically with the right number of partitions with respect to the stream.
KTable emptyTable = someTable.filter((k, v) -> false);
KStream stream = ...
KStream repartitionedStream = stream.selectKey(...)
    .leftJoin(emptyTable, (v, ignored) -> v, ...);
Edit
This approach apparently became a complicated abomination deserving of an avalanche of downvotes and a flogging in Aug 2020 when Kafka Streams 2.6.0 was introduced and KStream.repartition() came into existence.
So for Kafka Streams 2.6.x+ you should use KStream.repartition() instead:
KStream stream = ...
KStream repartitionedStream = stream.selectKey(...)
    .repartition();
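To see why the re-keyed stream has to pass through a repartition topic at all, here is a rough plain-Java model with no Kafka dependency. It is only a sketch: the `Message` class, the `customerId` field, and the `hashCode`-based partitioner are made up for illustration (Kafka's default partitioner hashes the serialized key bytes with murmur2 instead). What it shows is that once the key changes, records are routed by the new key, so all records sharing that key end up in the same partition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of re-keying and repartitioning. This is NOT the Kafka Streams API.
final class RepartitionSketch {

    // Hypothetical message: the original key plus a body field that becomes the new key.
    static final class Message {
        final String originalKey;
        final String customerId;
        final String payload;

        Message(String originalKey, String customerId, String payload) {
            this.originalKey = originalKey;
            this.customerId = customerId;
            this.payload = payload;
        }

        @Override
        public String toString() {
            return customerId + ":" + payload;
        }
    }

    // Simplified stand-in for a partitioner: non-negative hash of the key modulo the partition count.
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Routes each message by its new key, which is the role selectKey(...) plus a repartition step plays.
    static Map<Integer, List<Message>> repartitionByCustomer(List<Message> messages, int numPartitions) {
        Map<Integer, List<Message>> partitions = new TreeMap<>();
        for (Message m : messages) {
            int p = partitionFor(m.customerId, numPartitions);
            partitions.computeIfAbsent(p, ignored -> new ArrayList<>()).add(m);
        }
        return partitions;
    }

    public static void main(String[] args) {
        List<Message> messages = List.of(
            new Message("k1", "alice", "order-1"),
            new Message("k2", "bob", "order-2"),
            new Message("k3", "alice", "order-3"));
        // Print each partition with the messages routed to it.
        repartitionByCustomer(messages, 4)
            .forEach((p, ms) -> System.out.println("partition " + p + " -> " + ms));
    }
}
```

Both "alice" messages land in the same partition even though their original keys differ, which is the guarantee downstream per-key processing relies on.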