apache-kafka, apache-kafka-connect, aws-msk, apache-kafka-mirrormaker

Kafka Mirror to Apache Kafka from AWS MSK Cluster


I'm confused about the architecture for mirroring a topic from an AWS MSK cluster (source) to another Apache Kafka cluster (target).

The source cluster also uses the AWS Glue Schema Registry, so I need that Avro topic to be deserialized into a JSON topic in my target cluster.

For the setup, I'm using the confluentinc cp-kafka-connect image, which I have rebuilt to include the AWS jars (aws-msk-iam-auth-1.1.9-all.jar, schema-registry-serde-1.1.16.jar, schema-registry-kafkaconnect-converter-1.1.16.jar). These jars are added to the classpath at /usr/share/java/kafka, so the Kafka binaries can use them.

My main question is which binary is appropriate for this, and which connectors should be used (MM2, source, sink), so that the deserialization also happens on the target cluster.

  • connect-mirror-maker (2.0)
  • kafka-mirror-maker (legacy)
  • connect-distributed
  • connect-standalone

P.S. A configuration example would be great.


Solution

  • You should always use ByteArrayConverter when mirroring, so that no deserialization issues can occur and the bytes are unmodified in flight.

    MirrorMaker 2 is built into Kafka, so you don't need Confluent-specific libraries or Docker images. The image you've mentioned, however, automatically runs connect-distributed.

    Related docs (reversed direction): https://docs.aws.amazon.com/prescriptive-guidance/latest/patterns/migrate-an-on-premises-apache-kafka-cluster-to-amazon-msk-by-using-mirrormaker.html
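
Since the question asks for a configuration example, here is a minimal sketch of a properties file for the connect-mirror-maker binary. It is not a tested setup. The broker hostnames, ports, and topic name are hypothetical placeholders, and IAM authentication on the source is an assumption based on the aws-msk-iam-auth jar mentioned in the question. As far as I know, the dedicated MirrorMaker 2 mode defaults to ByteArrayConverter for keys and values, so no converter is set here.

```properties
# mm2.properties: minimal sketch for the connect-mirror-maker binary.
# Hostnames, ports, and the topic name below are hypothetical placeholders.

# Aliases for the two clusters; the names themselves are arbitrary.
clusters = source, target

# Source: the MSK cluster. IAM auth is an assumption, based on the
# aws-msk-iam-auth jar mentioned in the question.
source.bootstrap.servers = b-1.source.example.com:9098
source.security.protocol = SASL_SSL
source.sasl.mechanism = AWS_MSK_IAM
source.sasl.jaas.config = software.amazon.msk.auth.iam.IAMLoginModule required;
source.sasl.client.callback.handler.class = software.amazon.msk.auth.iam.IAMClientCallbackHandler

# Target: the self-managed Apache Kafka cluster (assumed plaintext listener).
target.bootstrap.servers = target-broker.example.com:9092

# Replicate in one direction only, and only the named topic.
source->target.enabled = true
source->target.topics = my-avro-topic
target->source.enabled = false
```

The file would be passed as the argument to connect-mirror-maker. With the default replication policy, the mirrored topic appears on the target prefixed with the source alias (here, source.my-avro-topic). Because the bytes are copied unmodified, the target topic would still hold the Glue-serialized Avro payload; converting it to JSON would be a separate step after mirroring.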