apache-kafka avro confluent-schema-registry

What is the value of an Avro Schema Registry?

I have many microservices reading/writing Avro messages in Kafka.

Schemas are great. Avro is great. But is a schema registry really needed? It centralizes schemas, yes, but do the microservices really need to query the registry? I don't think so.

Each microservice has a copy of the schema, user.avsc, and an Avro-generated POJO: User extends SpecificRecord. I want a POJO for each schema so the data is easy to manipulate in code.

Write to Kafka:

byte[] value = user.toByteBuffer().array();
producer.send(new ProducerRecord<>(TOPIC, key, value));

Read from Kafka:

User user = User.fromByteBuffer(ByteBuffer.wrap(record.value()));

Solution

Schema Registry gives a broader set of applications and services a way to use the data, not just your Java-based microservices.

For example, your microservice streams data to a topic, and you want to send that data to Elasticsearch or a database. If you've got the Schema Registry, you simply hook up Kafka Connect to the topic; it gets the schema from the registry and can create the target mapping or table. Without a Schema Registry, each consumer of the data has to find out some other way what the schema of the data is.

The same holds in the other direction: your microservice wants to access data that's written into a Kafka topic from elsewhere (e.g. with Kafka Connect, or any other producer). With the Schema Registry you can simply retrieve the schema. Without it, you couple your microservice development to knowing where the source data is produced and what its schema is.
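To make "simply retrieve the schema" concrete, here is a minimal sketch of what a consumer can do with a message written by Confluent's Avro serializer. Such messages start with a magic byte (0) and a 4-byte big-endian schema ID, followed by the Avro payload, and the registry serves the schema for an ID at /schemas/ids/{id}. Note that this framing is not what user.toByteBuffer() in the question produces. The class name WireFormatSketch, the registry URL, and the sample bytes are assumptions made for illustration; in practice the Confluent deserializer does this for you.

```java
import java.net.URI;
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical helper, not part of any Kafka or Confluent library.
final class WireFormatSketch {
    private static final byte MAGIC_BYTE = 0x0;
    private static final int HEADER_LENGTH = 5; // 1 magic byte + 4-byte schema ID

    /** Reads the big-endian schema ID that follows the magic byte. */
    static int schemaId(byte[] message) {
        if (message.length < HEADER_LENGTH || message[0] != MAGIC_BYTE) {
            throw new IllegalArgumentException("Not in Schema Registry wire format");
        }
        // ByteBuffer is big-endian by default, matching the wire format.
        return ByteBuffer.wrap(message, 1, 4).getInt();
    }

    /** Returns the bytes after the 5-byte header, i.e. the Avro-encoded record. */
    static byte[] avroPayload(byte[] message) {
        schemaId(message); // reuse the header validation
        return Arrays.copyOfRange(message, HEADER_LENGTH, message.length);
    }

    /** Builds the registry URL that returns the schema for a given ID as JSON. */
    static URI schemaLookupUri(String registryBaseUrl, int schemaId) {
        return URI.create(registryBaseUrl + "/schemas/ids/" + schemaId);
    }

    public static void main(String[] args) {
        // Header for schema ID 42, followed by three placeholder payload bytes.
        byte[] message = {0, 0, 0, 0, 42, 1, 2, 3};
        int id = schemaId(message);
        System.out.println("schema id: " + id);
        System.out.println("payload bytes: " + avroPayload(message).length);
        // Assumed registry address; adjust to your environment.
        System.out.println("lookup: " + schemaLookupUri("http://localhost:8081", id));
    }
}
```

The point of the design is that the message carries only a small ID, so any consumer in any language can resolve the writer's schema at read time instead of shipping a copy of user.avsc with every service.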

There's a good talk about this subject here: https://qconnewyork.com/system/files/presentation-slides/qcon_17_-_schemas_and_apis.pdf