From time to time I come across Confluent's Schema Registry and its schema validation functionality.
According to the documentation, this works by setting confluent.value.schema.validation or confluent.key.schema.validation to true on a topic. If I then use a tool like kafka-console-producer to send an invalid message, the broker rejects it.
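To make the scenario concrete, this is roughly the setup I mean (broker address and topic name are placeholders; the commands assume a Confluent Server broker with Schema Registry running):

```shell
# Enable value schema validation on an existing topic
# (confluent.value.schema.validation is a Confluent Server topic config,
# not available in Apache Kafka).
kafka-configs --bootstrap-server localhost:9092 \
  --alter --entity-type topics --entity-name my-topic \
  --add-config confluent.value.schema.validation=true

# A plain-text message produced with the stock console producer is then
# rejected by the broker:
kafka-console-producer --bootstrap-server localhost:9092 --topic my-topic
```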
How does that actually work? After all, kafka-console-producer is part of Apache Kafka and not (directly) related to Confluent.
I'm not aware of any Kafka plugin API that would support something like that, nor could I find anything on Confluent's GitHub page. My guess was that Confluent Platform somehow hides intermediate topics to enable this, but even then I don't see how they would trigger an error in kafka-console-producer.
The server-side schema validation feature is only available in Confluent Server, a closed-source, enterprise build of Kafka; it is not part of Apache Kafka.
But the basic idea would be to introduce a feature flag in the source code (Confluent maintains their own Kafka fork), intercept the produce request server-side, deserialize each record before it is written to disk, and verify it against the schema registered for the topic.
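A minimal sketch of what such a per-record check could look like, assuming the Schema Registry wire format (a magic byte 0 followed by a 4-byte big-endian schema ID, which is how the registry-aware serializers frame payloads). All names here are illustrative, not Confluent's actual implementation:

```python
import struct

MAGIC_BYTE = 0  # first byte of a Schema Registry-framed payload

def parse_wire_format(payload: bytes) -> int:
    """Return the schema ID from a registry-framed message.

    Registry-aware serializers prefix each payload with a magic byte (0)
    and a 4-byte big-endian schema ID before the encoded data.
    """
    if len(payload) < 5 or payload[0] != MAGIC_BYTE:
        raise ValueError("record is not framed with a schema ID")
    return struct.unpack(">I", payload[1:5])[0]

def validate_record(payload: bytes, registered_ids) -> None:
    """Hypothetical broker-side check: reject records whose schema ID
    is unknown for the topic (or whose payload is not framed at all)."""
    schema_id = parse_wire_format(payload)
    if schema_id not in registered_ids:
        raise ValueError(f"schema id {schema_id} not registered for topic")

# A record produced with schema ID 42, followed by encoded data:
framed = bytes([MAGIC_BYTE]) + struct.pack(">I", 42) + b"\x02hi"
validate_record(framed, registered_ids={42})  # passes

# A raw string from kafka-console-producer has no framing and fails:
try:
    validate_record(b"plain text from console producer", {42})
except ValueError as e:
    print(e)  # the broker would map this to a produce-response error
```

In the real feature the broker would also look the schema up in Schema Registry rather than in an in-memory set, but the interception point, before the record hits the log, is the same idea.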
don't know how they trigger an error in kafka-console-producer
I personally haven't used the feature, so I'm not sure which error you're referring to, but the Kafka protocol's produce response already carries per-partition error codes, so the broker can reject the batch with an error (for example INVALID_RECORD) that any standard client, including kafka-console-producer, will surface; they could also return a custom response through the Kafka protocol.