
Question about connector plugin versions in Strimzi's Kafka Connect


I was playing around with Kafka Connect's functionality to build images, and I have some questions. I used Strimzi operator version 0.45.0 in all examples.

When I create a Strimzi Kafka Connect instance with no image specified by applying this configuration:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster-connect-build
  namespace: first-kafka-namespace
  labels:
    my-connect: my-connect-cluster-connect-build
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.8.0
  replicas: 1
  bootstrapServers: "first-kafka-cluster-kafka-bootstrap:9092"
  config:
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    config.storage.replication.factor: 1
    offset.storage.replication.factor: 1
    status.storage.replication.factor: 1

The available connectors are these (output of the curl -s http://localhost:8083/connector-plugins | jq . command):

[
  {
    "class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
    "type": "source",
    "version": "3.8.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
    "type": "source",
    "version": "3.8.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "type": "source",
    "version": "3.8.0"
  }
]

This makes sense to me: the Kafka version is 3.8.0, and the connector versions are 3.8.0. However, it seems strange to me that the FileStream connectors are not listed, considering that they are also part of the default Kafka Connect.

Some time later, in a completely different environment, I created a Kafka Connect instance in order to build an image and push it to GHCR. The configuration of that Kafka Connect instance looked like this:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: kafka-connect-cluster
  namespace: kafka
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.9.0
  replicas: 1
  bootstrapServers: "kafka-cluster-kafka-bootstrap:9092"
  config:
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    config.storage.replication.factor: 1
    offset.storage.replication.factor: 1
    status.storage.replication.factor: 1
  build:
    output:
      type: docker
      image: <GHCR Registry>
      pushSecret: ghcr-push-secret
    plugins:
      - name: file-source-connector
        artifacts:
          - type: jar
            url: https://repo1.maven.org/maven2/org/apache/kafka/connect-file/3.6.0/connect-file-3.6.0.jar

The image was successfully built and pushed to the registry.

Then, I wanted to test the image, so I created a testing Kafka Connect instance with this config:

apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster-connect-build
  namespace: first-kafka-namespace
  labels:
    my-connect: my-connect-cluster-connect-build
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  image: <image from GHCR>
  replicas: 1
  bootstrapServers: "first-kafka-cluster-kafka-bootstrap:9092"
  config:
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    config.storage.replication.factor: 1
    offset.storage.replication.factor: 1
    status.storage.replication.factor: 1
  template:
    pod:
      imagePullSecrets:
        - name: ghcr-pull-secret

When I listed the connectors, I expected to see the three MirrorMaker connectors at version 3.9.0, since that was the Kafka version of the Kafka Connect instance that built the image, and the FileStream connectors at version 3.6.0, since that was the connector version I specified in the build section.

But what I got was this:

[
  {
    "class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "type": "sink",
    "version": "3.9.0"
  },
  {
    "class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "type": "source",
    "version": "3.9.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
    "type": "source",
    "version": "3.9.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
    "type": "source",
    "version": "3.9.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "type": "source",
    "version": "3.9.0"
  }
]

What I don't understand is why the FileStream connectors are at version 3.9.0.

Maybe, since the Kafka version of the Kafka Connect instance that created the image is 3.9.0, it somehow overrode the FileStream connectors of version 3.6.0 that I wanted to put in the image and used those of version 3.9.0 instead.

Could that be the reason?

Also, I don't understand why GET connector-plugins did not list the FileStream connectors the first time but does now. What am I doing wrong here?


Solution

  • The FileStream source and sink connectors are not part of Kafka's default classpath, so they do not show up by default. This is different from the Kafka MirrorMaker 2 connectors, which is why those show up while the file connectors don't.

    As for the version of the FileStream connector you added: this comes from the way the connector loads its version. The FileStream connector reports the Kafka version it is running on rather than the version of its own JAR. You added it to Kafka 3.9.0, so it shows 3.9.0 as the version. I think there is no reason to use the 3.6.0 connector with Kafka 3.9.0; you should really use the 3.9.0 version. The connector is listed at all because you added it to the container image in spec.build.

    So what you are seeing is, from my perspective, as expected.
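
    To make "use the 3.9.0 version" concrete, here is a minimal sketch of the build resource from the question with the connector JAR aligned to the Kafka version. It assumes the 3.9.0 artifact sits at the same Maven Central path as the 3.6.0 one used in the question; that URL is an assumption derived from the original one and was not tested here. Everything else is copied from the question, with the config block left out for brevity.

    ```yaml
    apiVersion: kafka.strimzi.io/v1beta2
    kind: KafkaConnect
    metadata:
      name: kafka-connect-cluster
      namespace: kafka
      annotations:
        strimzi.io/use-connector-resources: "true"
    spec:
      version: 3.9.0
      replicas: 1
      bootstrapServers: "kafka-cluster-kafka-bootstrap:9092"
      # config: same storage topic settings as in the question (omitted here)
      build:
        output:
          type: docker
          image: <GHCR Registry>
          pushSecret: ghcr-push-secret
        plugins:
          - name: file-source-connector
            artifacts:
              - type: jar
                # Assumption: the question's URL with 3.6.0 replaced by 3.9.0,
                # so the JAR matches spec.version
                url: https://repo1.maven.org/maven2/org/apache/kafka/connect-file/3.9.0/connect-file-3.9.0.jar
    ```

    With the versions aligned, the 3.9.0 reported by GET connector-plugins for the FileStream connectors would match the JAR that is actually in the image.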