Tags: apache-kafka, avro, confluent-schema-registry, debezium

Debezium might produce invalid schema


I face an issue with Avro and Schema Registry. After Debezium created a schema and a topic, I downloaded the schema from the Schema Registry and put it into an .avsc file, which looks like this:

  {
    "type": "record",
    "name": "Envelope",
    "namespace": "my.namespace",
    "fields": [
      {
        "name": "before",
        "type": [
          "null",
          {
            "type": "record",
            "name": "MyValue",
            "fields": [
              {
                "name": "id",
                "type": "int"
              }
            ]
          }
        ],
        "default": null
      },
      {
        "name": "after",
        "type": [
          "null",
          "MyValue"
        ],
        "default": null
      }
    ]
  }

I ran two experiments:

  1. I tried to put it back into the Schema Registry, but I got this error: MyValue is not correct. When I remove the "after" record, the schema registers without problems. (A registration sketch follows this list.)

  2. I used 'generate-sources' from the avro-maven-plugin to generate the Java classes. When I try to consume the topic above, I see this error:

    Exception in thread "b2-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Exception caught in process. [...]: Error registering Avro schema: [...]

    Caused by: io.confluent.kafka.schemaregistry.client.rest.exceptions.RestClientException: Schema being registered is incompatible with an earlier schema; error code: 409
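
For reference, experiment 1 can be reproduced programmatically. Below is a minimal sketch; the file name (envelope.avsc), subject name (my-topic-value), and registry URL are assumptions, and it uses the older (pre-5.5) Confluent client where register() takes an org.apache.avro.Schema:

  import java.io.File;

  import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient;
  import io.confluent.kafka.schemaregistry.client.SchemaRegistryClient;
  import org.apache.avro.Schema;

  public class RegisterDownloadedSchema {
      public static void main(String[] args) throws Exception {
          // Parse the downloaded schema file (file name is hypothetical).
          Schema schema = new Schema.Parser().parse(new File("envelope.avsc"));

          // Try to register it back under the topic's value subject; with the
          // full schema above, this call is where the registry rejects it.
          SchemaRegistryClient client =
                  new CachedSchemaRegistryClient("http://localhost:8081", 100);
          int id = client.register("my-topic-value", schema);
          System.out.println("Registered with id " + id);
      }
  }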

Did anyone face the same problem? Is it Debezium that is producing an invalid schema, or is it Schema Registry that has a bug?


Solution

  • MyValue is not correct.

    That's not an Avro type. You would have to embed the actual record within the union, just like the before value (see the sketch below).

    In other words, you're not able to cross-reference the record types within a schema, AFAIK.
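
    A sketch of that workaround, checked with plain Avro's parser. One caveat: Avro forbids defining the same fully-qualified name twice in one schema, so the copy embedded under "after" needs a distinct name; MyValueAfter below is a hypothetical choice (it also changes the generated Java classes):

      import org.apache.avro.Schema;

      public class EmbedRecordSketch {
          public static void main(String[] args) {
              // Java 15+ text block holding the reworked schema: the "after"
              // union embeds a full record definition instead of the bare name
              // "MyValue". Avro won't accept two definitions of
              // my.namespace.MyValue, so the embedded copy gets a distinct,
              // hypothetical name.
              String reworked = """
                  {
                    "type": "record",
                    "name": "Envelope",
                    "namespace": "my.namespace",
                    "fields": [
                      {
                        "name": "before",
                        "type": ["null", {
                          "type": "record",
                          "name": "MyValue",
                          "fields": [{ "name": "id", "type": "int" }]
                        }],
                        "default": null
                      },
                      {
                        "name": "after",
                        "type": ["null", {
                          "type": "record",
                          "name": "MyValueAfter",
                          "fields": [{ "name": "id", "type": "int" }]
                        }],
                        "default": null
                      }
                    ]
                  }
                  """;
              Schema parsed = new Schema.Parser().parse(reworked);
              System.out.println(parsed.getFullName()); // my.namespace.Envelope
          }
      }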

    When I try to consume the topic above, I see this error:

    A consumer does not register schemas, so it's not clear how you're getting that error, unless you're using Kafka Streams, which produces into intermediate topics (see the sketch below).
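
    To illustrate, here is a minimal sketch of a Streams topology that triggers such a registration; the topic name, the re-keying step, and the registry URL are assumptions, and Envelope stands for the class the avro-maven-plugin generated from the schema above:

      import java.util.Map;

      import io.confluent.kafka.streams.serdes.avro.SpecificAvroSerde;
      import org.apache.kafka.common.serialization.Serdes;
      import org.apache.kafka.streams.StreamsBuilder;
      import org.apache.kafka.streams.kstream.Consumed;
      import org.apache.kafka.streams.kstream.Grouped;

      import my.namespace.Envelope; // generated by avro-maven-plugin

      public class StreamsRegistersSchemas {
          public static void main(String[] args) {
              // By default the Avro serde auto-registers schemas for every
              // topic it writes to (auto.register.schemas=true).
              SpecificAvroSerde<Envelope> envelopeSerde = new SpecificAvroSerde<>();
              envelopeSerde.configure(
                      Map.of("schema.registry.url", "http://localhost:8081"),
                      /* isKey */ false);

              StreamsBuilder builder = new StreamsBuilder();
              builder.stream("my-topic", Consumed.with(Serdes.String(), envelopeSerde))
                     // Re-keying forces a repartition topic. When Streams writes
                     // Envelope values into that internal topic, the serde tries
                     // to register the generated schema under a new subject --
                     // an incompatible schema then surfaces as the 409 error.
                     .selectKey((key, value) -> key)
                     .groupByKey(Grouped.with(Serdes.String(), envelopeSerde))
                     .count();
              // builder.build() would then be passed to KafkaStreams as usual.
          }
      }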