apache-kafka · avro · confluent-schema-registry

Consumer schema update during deserialization


I'm currently studying the Avro schema system, and from what I understand the flow of a schema update is:

1) A client changes the schema (for example, by adding a new field with a default value for backwards compatibility) and sends data to Kafka serialized with the updated schema (a sketch of such a change follows these steps).

2) The Schema Registry runs compatibility checks and registers the new schema under a new version number with a unique ID.

3) The consumer (still using the old schema) reads the data, and schema evolution drops the new field, so the consumer can still deserialize it.
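
For example, a backward-compatible change like this could be expressed with Avro's SchemaBuilder (the record and field names here are just for illustration):

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class SchemaVersions {
    public static void main(String[] args) {
        // v1: the schema originally registered under the subject
        Schema v1 = SchemaBuilder.record("Payment").namespace("example.avro")
                .fields()
                .requiredString("id")
                .requiredDouble("amount")
                .endRecord();

        // v2: adds "currency" with a default, so the change is backward
        // compatible; a consumer still reading with v1 never sees the new field
        Schema v2 = SchemaBuilder.record("Payment").namespace("example.avro")
                .fields()
                .requiredString("id")
                .requiredDouble("amount")
                .name("currency").type().stringType().stringDefault("USD")
                .endRecord();

        System.out.println(v1.toString(true));
        System.out.println(v2.toString(true));
    }
}
```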

From what I understand, we need to explicitly update the consumers after a schema change in order to supply them with the latest schema. But why can't the consumer just pull the latest schema when it sees that the ID has changed?


Solution

  • You need to update consumer schemas only if they are using a SpecificRecord subclass; that effectively skips the schema-ID lookup (see the consumer config sketch below).

    If you want it to always match the latest, you can make an HTTP call to the registry's /versions/latest endpoint for the subject, fetch the schema, and then restart the app (see the registry client sketch below).

    Or, if you always want the consumer to use the schema referenced by the ID embedded in each message, use GenericRecord as the object type (also shown in the config sketch below).
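
    The difference between the two consumer styles comes down to the `specific.avro.reader` setting of the Confluent Avro deserializer. A minimal sketch, assuming a local broker and registry (URLs and the group id are placeholders):

    ```java
    import java.util.Properties;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import io.confluent.kafka.serializers.KafkaAvroDeserializer;
    import io.confluent.kafka.serializers.KafkaAvroDeserializerConfig;

    public class AvroConsumerConfig {
        public static KafkaConsumer<String, GenericRecord> genericConsumer() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");          // placeholder broker
            props.put("group.id", "example-group");                    // placeholder group id
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", KafkaAvroDeserializer.class.getName());
            props.put("schema.registry.url", "http://localhost:8081"); // placeholder registry

            // false (the default) => GenericRecord: the deserializer resolves whatever
            // writer schema the embedded ID points to, so new fields show up without
            // rebuilding the consumer.
            // true  => SpecificRecord: records are mapped onto the generated class baked
            // into the consumer, so new fields only appear after regenerating that class
            // from the latest schema and redeploying.
            props.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, false);

            return new KafkaConsumer<>(props);
        }
    }
    ```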
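
    For the "pull the latest" option, the registry exposes GET /subjects/&lt;subject&gt;/versions/latest, and the Java client wraps that call. A rough sketch, where the registry URL and the subject name (default TopicNameStrategy, i.e. "&lt;topic&gt;-value") are placeholders:

    ```java
    import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
    import io.confluent.kafka.schemaregistry.client.SchemaMetadata;

    public class LatestSchemaCheck {
        public static void main(String[] args) throws Exception {
            CachedSchemaRegistryClient client =
                    new CachedSchemaRegistryClient("http://localhost:8081", 100);

            // Fetch the newest registered version for the subject; compare its ID
            // against what the app was built with before deciding to restart/redeploy.
            SchemaMetadata latest = client.getLatestSchemaMetadata("payments-value");
            System.out.println("id=" + latest.getId() + ", version=" + latest.getVersion());
            System.out.println(latest.getSchema());
        }
    }
    ```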