
Kafka MirrorMaker 2 custom partitioner


I am trying to determine whether it is possible to use a custom partitioner with MirrorMaker 2, so that my partitioner is used when replicating to the target cluster. According to the docs at https://github.com/apache/kafka/tree/trunk/connect/mirror, it should be possible to override the MM2 producer settings using the config format target-alias.producer.*. I've tried various formats, such as the following (where source and target are my cluster aliases):

target.partitioner.class=com.my.custom.Partitioner
target.producer.partitioner.class=com.my.custom.Partitioner
target.cluster.producer.partitioner.class=com.my.custom.Partitioner
source->target.partitioner.class=com.my.custom.Partitioner
source->target.producer.partitioner.class=com.my.custom.Partitioner

I can see in the MirrorMaker 2 logs that the partitioner is loaded successfully, and where the producer config is dumped out, the custom partitioner class appears to be set. However, according to the debug logs in my partitioner class (and my own observation), it is only invoked when MirrorMaker 2 produces to internal topics such as mm2-offsets.source.internal, not when it produces to the topics actually being replicated.

Can anyone help me understand the behaviour above? I'm assuming there are separate producer clients, some responsible for writing replicated messages and some responsible for updating the internal topics, but if so, I'm not sure why the custom partitioner would only work for the latter.


Solution

  • MirrorMaker 2 is a source connector. It fetches records from an external system, in this case another Kafka cluster, and hands them to the Kafka Connect runtime, which produces them to the target cluster.

    So, as you noticed, MirrorMaker 2 itself does not have producers that write mirrored records; the producers you configured are only used for its internal topics.

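    MirrorMaker 2 ensures mirrored records keep the same partition on the target topics as on the source topics. To do so, it creates topics in the target cluster with the same number of partitions as the source topics, and each mirrored record is handed over with its source partition already set. A producer only consults its partitioner for records that have no partition assigned, so a custom partitioner is never invoked for mirrored records. This is by design, as preserving partitions is the main use case of a mirroring tool.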

    It's not clear what your use case is. You might be able to change the records' partitions using a custom Single Message Transform (SMT). Otherwise, once the topics are mirrored, use Kafka Streams to repartition them.
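
To illustrate why a partitioner is bypassed when a record already carries a partition, here is a minimal, simplified model of the selection rule. It is a sketch, not Kafka's actual code: `PartitionChoice`, `OutgoingRecord` and `choosePartition` are hypothetical names invented for this example, and the "custom partitioner" is just a lambda that always returns partition 0.

```java
import java.util.function.ToIntFunction;

public class PartitionChoice {

    // Hypothetical stand-in for a record handed to a producer.
    // A null partition means "let the partitioner decide".
    public static final class OutgoingRecord {
        final String key;
        final Integer partition;

        public OutgoingRecord(String key, Integer partition) {
            this.key = key;
            this.partition = partition;
        }
    }

    // Simplified rule: an explicit partition always wins;
    // the partitioner is consulted only when no partition is set.
    public static int choosePartition(OutgoingRecord rec, ToIntFunction<String> partitioner) {
        if (rec.partition != null) {
            return rec.partition;
        }
        return partitioner.applyAsInt(rec.key);
    }

    public static void main(String[] args) {
        // Assumed custom partitioner: sends everything to partition 0.
        ToIntFunction<String> custom = key -> 0;

        // Mirrored record: the source partition (3) is already set.
        OutgoingRecord mirrored = new OutgoingRecord("order-42", 3);
        // Record without a partition, as for writes where none is assigned.
        OutgoingRecord unassigned = new OutgoingRecord("order-42", null);

        System.out.println(choosePartition(mirrored, custom));   // prints 3
        System.out.println(choosePartition(unassigned, custom)); // prints 0
    }
}
```

Under this model, the only way to get the partitioner involved for mirrored records is to remove or change the partition before the record reaches the producer, which is what an SMT would have to do.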