I was experimenting with Kafka Connect's image build functionality in Strimzi, and I have some questions. I used Strimzi operator version 0.45.0 in all examples.
When I create a Strimzi Kafka Connect instance with no image specified by applying this configuration:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster-connect-build
  namespace: first-kafka-namespace
  labels:
    my-connect: my-connect-cluster-connect-build
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.8.0
  replicas: 1
  bootstrapServers: "first-kafka-cluster-kafka-bootstrap:9092"
  config:
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    config.storage.replication.factor: 1
    offset.storage.replication.factor: 1
    status.storage.replication.factor: 1
The available connector plugins (listed by running the command curl -s http://localhost:8083/connector-plugins | jq .) are:
[
  {
    "class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
    "type": "source",
    "version": "3.8.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
    "type": "source",
    "version": "3.8.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "type": "source",
    "version": "3.8.0"
  }
]
This makes sense to me: the Kafka version is 3.8.0, and the connector versions are 3.8.0. However, it seems strange to me that the FileStream connectors are not listed, considering that they are also part of the default Kafka Connect.
Some time later, in a completely different environment, I created a Kafka Connect instance in order to build an image and push it to GHCR. The configuration of that Kafka Connect instance looked like this:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: kafka-connect-cluster
  namespace: kafka
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 3.9.0
  replicas: 1
  bootstrapServers: "kafka-cluster-kafka-bootstrap:9092"
  config:
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    config.storage.replication.factor: 1
    offset.storage.replication.factor: 1
    status.storage.replication.factor: 1
  build:
    output:
      type: docker
      image: <GHCR Registry>
      pushSecret: ghcr-push-secret
    plugins:
      - name: file-source-connector
        artifacts:
          - type: jar
            url: https://repo1.maven.org/maven2/org/apache/kafka/connect-file/3.6.0/connect-file-3.6.0.jar
The image was successfully built and pushed to the registry.
Then, I wanted to test the image, so I created a testing Kafka Connect instance with this config:
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster-connect-build
  namespace: first-kafka-namespace
  labels:
    my-connect: my-connect-cluster-connect-build
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  image: <image from GHCR>
  replicas: 1
  bootstrapServers: "first-kafka-cluster-kafka-bootstrap:9092"
  config:
    group.id: connect-cluster
    offset.storage.topic: connect-cluster-offsets
    config.storage.topic: connect-cluster-configs
    status.storage.topic: connect-cluster-status
    config.storage.replication.factor: 1
    offset.storage.replication.factor: 1
    status.storage.replication.factor: 1
  template:
    pod:
      imagePullSecrets:
        - name: ghcr-pull-secret
When I listed the connector plugins, I expected to see the three MirrorMaker connectors at version 3.9.0, since that was the Kafka version of the Kafka Connect instance that built the image, and the FileStream connectors at version 3.6.0, since that was the version of the connector JAR I specified in the build section.
But what I got was this:
[
  {
    "class": "org.apache.kafka.connect.file.FileStreamSinkConnector",
    "type": "sink",
    "version": "3.9.0"
  },
  {
    "class": "org.apache.kafka.connect.file.FileStreamSourceConnector",
    "type": "source",
    "version": "3.9.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
    "type": "source",
    "version": "3.9.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorHeartbeatConnector",
    "type": "source",
    "version": "3.9.0"
  },
  {
    "class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
    "type": "source",
    "version": "3.9.0"
  }
]
What I don't understand is why the FileStream connectors report version 3.9.0.
Maybe, because the Kafka version of the Kafka Connect instance that built the image is 3.9.0, it somehow overrode the FileStream connectors of version 3.6.0 that I wanted to put in the image and used those of version 3.9.0 instead.
Could that be the reason?
Also, I don't understand why GET connector-plugins did not list the FileStream connectors the first time but does now. What am I doing wrong here?
The FileStream source and sink connectors are not on Kafka's default classpath, so they do not show up by default. This is different from the Kafka MirrorMaker 2 connectors, which is why those show up while the file connectors don't.
As for the version of the FileStream connector you added: this comes from the way the connector determines its version. The version the FileStream connector reports is the Kafka version it runs on, not the version of the JAR. You added it to Kafka 3.9.0, so it shows 3.9.0 as the version. I think there is no reason to use the 3.6.0 connector with Kafka 3.9.0; you should really use the 3.9.0 version. The connector is listed there at all because you added it to the container image in spec.build.
So what you are seeing is, from my perspective, as expected.
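As a rough sketch of the suggestion above, the build section could reference the connect-file JAR that matches the Kafka version of the build instead of 3.6.0. The 3.9.0 URL is an assumption that follows the same Maven Central path pattern as the 3.6.0 URL in the question; the registry placeholder and secret name are copied from the question. Only the relevant part of the KafkaConnect resource is shown:

```yaml
# Sketch only: fragment of the KafkaConnect resource from the question,
# with the connect-file artifact aligned to spec.version.
spec:
  version: 3.9.0
  build:
    output:
      type: docker
      image: <GHCR Registry>          # placeholder kept from the question
      pushSecret: ghcr-push-secret
    plugins:
      - name: file-source-connector
        artifacts:
          - type: jar
            # Assumed URL: same path pattern as the 3.6.0 JAR, version changed to 3.9.0
            url: https://repo1.maven.org/maven2/org/apache/kafka/connect-file/3.9.0/connect-file-3.9.0.jar
```

With this, the version of the JAR and the version shown by GET connector-plugins should agree, which avoids the confusion described in the question.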