apache-kafka · avro · confluent-schema-registry

Consumer schema update during deserialization


I'm currently studying the Avro schema system, and from what I understand the flow of a schema update is:

1) A client changes the schema (for example, by adding a new field with a default value for backwards compatibility) and sends data to Kafka serialized with the updated schema (a sketch of such a change follows these steps).

2) The Schema Registry runs compatibility checks and registers the new schema under a new version number with a unique ID.

3) The consumer (still using the old schema) reads the data, and schema evolution drops the new field, so the consumer can still deserialize it.
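
For example, a backward-compatible change like this could be expressed with Avro's SchemaBuilder (the record and field names here are just for illustration):

```java
import org.apache.avro.Schema;
import org.apache.avro.SchemaBuilder;

public class SchemaVersions {
    public static void main(String[] args) {
        // v1: the schema originally registered under the subject
        Schema v1 = SchemaBuilder.record("Payment").namespace("example.avro")
                .fields()
                .requiredString("id")
                .requiredDouble("amount")
                .endRecord();

        // v2: adds "currency" with a default, so the change is backward
        // compatible; a consumer still reading with v1 never sees the new field
        Schema v2 = SchemaBuilder.record("Payment").namespace("example.avro")
                .fields()
                .requiredString("id")
                .requiredDouble("amount")
                .name("currency").type().stringType().stringDefault("USD")
                .endRecord();

        System.out.println(v1.toString(true));
        System.out.println(v2.toString(true));
    }
}
```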

From what I understand, we need to explicitly update the consumers after a schema change in order to supply them with the latest schema. But why can't the consumer just pull the latest schema when it sees that the ID has changed?


Solution

  • You need to update consumer schemas only if they are using a SpecificRecord subclass; that effectively skips the schema-ID lookup (see the consumer config sketch below).

    If you want it to always match the latest, you can make an HTTP call to the registry's /versions/latest endpoint for the subject, fetch the schema, and then restart the app (see the registry client sketch below).

    Or, if you always want the consumer to use the schema referenced by the ID embedded in each message, use GenericRecord as the object type (also shown in the config sketch below).
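
    The difference between the two consumer styles comes down to the `specific.avro.reader` setting of the Confluent Avro deserializer. A minimal sketch, assuming a local broker and registry (URLs and the group id are placeholders):

    ```java
    import java.util.Properties;
    import org.apache.avro.generic.GenericRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import io.confluent.kafka.serializers.KafkaAvroDeserializer;
    import io.confluent.kafka.serializers.KafkaAvroDeserializerConfig;

    public class AvroConsumerConfig {
        public static KafkaConsumer<String, GenericRecord> genericConsumer() {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");          // placeholder broker
            props.put("group.id", "example-group");                    // placeholder group id
            props.put("key.deserializer",
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", KafkaAvroDeserializer.class.getName());
            props.put("schema.registry.url", "http://localhost:8081"); // placeholder registry

            // false (the default) => GenericRecord: the deserializer resolves whatever
            // writer schema the embedded ID points to, so new fields show up without
            // rebuilding the consumer.
            // true  => SpecificRecord: records are mapped onto the generated class baked
            // into the consumer, so new fields only appear after regenerating that class
            // from the latest schema and redeploying.
            props.put(KafkaAvroDeserializerConfig.SPECIFIC_AVRO_READER_CONFIG, false);

            return new KafkaConsumer<>(props);
        }
    }
    ```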
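
    For the "pull the latest" option, the registry exposes GET /subjects/&lt;subject&gt;/versions/latest, and the Java client wraps that call. A rough sketch, where the registry URL and the subject name (default TopicNameStrategy, i.e. "&lt;topic&gt;-value") are placeholders:

    ```java
    import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
    import io.confluent.kafka.schemaregistry.client.SchemaMetadata;

    public class LatestSchemaCheck {
        public static void main(String[] args) throws Exception {
            CachedSchemaRegistryClient client =
                    new CachedSchemaRegistryClient("http://localhost:8081", 100);

            // Fetch the newest registered version for the subject; compare its ID
            // against what the app was built with before deciding to restart/redeploy.
            SchemaMetadata latest = client.getLatestSchemaMetadata("payments-value");
            System.out.println("id=" + latest.getId() + ", version=" + latest.getVersion());
            System.out.println(latest.getSchema());
        }
    }
    ```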