java apache-kafka deserialization avro

Nested Avro record deserialization with array type of records


I am writing code to deserialize nested Avro records that contain an array of records. I use a separate POJO for the array's record type and reference it as a list in the main POJO class. However, I do not understand how to use multiple schemas to deserialize the record.

Schema structure:

{
"type": "record",
"name": "MainSchemaName",
"version": "2",
"namespace": "com.cmain",
"doc": "AExample",
"fields": [
 {
  "name": "MainABC",
  "type": {
    "type": "array",
    "items": {
      "name": "ABCarr",
      "type": "record",
      "fields": [
        {
          "name": "prod1",
          "type": "double"
        },
        {
          "name": "prod2",
          "type": "string"
        }
      ]
    }
  }
 },
 {
  "name": "comnsu1",
  "type": "int"
 }
 ]
}

Solution

  • You need to define each record in its own schema file, as follows:

    ABCarr.avsc:

    {
          "name": "ABCarr",
          "namespace": "com.cmain",
          "type": "record",
          "fields": [
            {
              "name": "prod1",
              "type": "double"
            },
            {
              "name": "prod2",
              "type": "string"
            }
          ]
    }
    

    MainSchemaName.avsc:

    {
    "type": "record",
    "name": "MainSchemaName",
    "version": "2",
    "namespace": "com.cmain",
    "doc": "AExample",
    "fields": [
     {
      "name": "MainABC",
      "type": {
           "type": "array",
           "items": "com.cmain.ABCarr",
           "java-class": "java.util.List"
         }
     },
     {
      "name": "comnsu1",
      "type": "int"
     }
     ]
    }
    

    Then configure the avro-maven-plugin so that schemas required by other schemas (ABCarr in your case) are compiled first. Assuming your Avro schema files are located in src/main/resources/schema/avro, specify the following to build ABCarr first and then use it as a type in MainSchemaName:

    <plugin>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-maven-plugin</artifactId>
        <version>1.9.2</version>
        <executions>
            <execution>
                <phase>generate-sources</phase>
                <goals>
                    <goal>schema</goal>
                </goals>
                <configuration>
                    <sourceDirectory>${project.basedir}/src/main/resources/schema/avro</sourceDirectory>
                    <stringType>String</stringType>
                    <createSetters>true</createSetters>
                    <fieldVisibility>private</fieldVisibility>
                    <imports>
                        <import>${project.basedir}/src/main/resources/schema/avro/ABCarr.avsc</import>
                    </imports>
                </configuration>
            </execution>
        </executions>
    </plugin>
    

    In short, list every shared schema under the imports option so that it is compiled before the schemas that reference it.
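
To check the two schema files at runtime without the generated classes, a minimal sketch along the following lines may help. It swaps in Avro's generic API (GenericRecord) for the plugin-generated POJOs, so it is not the same code path as the answer above. The class name NestedAvroRoundTrip, the helper method names, and the sample values are hypothetical. The schemas are inlined as strings for self-containment, with the "version", "doc", and "java-class" attributes left out for brevity. Parsing ABCarr first with the same Schema.Parser plays the role of the imports option: the parser remembers the named type, so MainSchemaName can refer to it by its full name.

```java
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericData;
import org.apache.avro.generic.GenericDatumReader;
import org.apache.avro.generic.GenericDatumWriter;
import org.apache.avro.generic.GenericRecord;
import org.apache.avro.io.BinaryDecoder;
import org.apache.avro.io.BinaryEncoder;
import org.apache.avro.io.DecoderFactory;
import org.apache.avro.io.EncoderFactory;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Collections;
import java.util.List;

// Hypothetical class name; illustrates a round trip with the generic API only.
public class NestedAvroRoundTrip {

    // Inlined copy of ABCarr.avsc (assumption: the real project reads the file).
    static final String ABCARR =
        "{\"name\":\"ABCarr\",\"namespace\":\"com.cmain\",\"type\":\"record\","
      + "\"fields\":[{\"name\":\"prod1\",\"type\":\"double\"},"
      + "{\"name\":\"prod2\",\"type\":\"string\"}]}";

    // Inlined copy of MainSchemaName.avsc, referring to ABCarr by full name.
    static final String MAIN =
        "{\"type\":\"record\",\"name\":\"MainSchemaName\",\"namespace\":\"com.cmain\","
      + "\"fields\":[{\"name\":\"MainABC\",\"type\":{\"type\":\"array\","
      + "\"items\":\"com.cmain.ABCarr\"}},{\"name\":\"comnsu1\",\"type\":\"int\"}]}";

    // One parser instance for both schemas: ABCarr must be known before MAIN.
    static Schema parseMainSchema() {
        Schema.Parser parser = new Schema.Parser();
        parser.parse(ABCARR);
        return parser.parse(MAIN);
    }

    // Builds a sample record with one array element (values are made up).
    static GenericRecord sample(Schema mainSchema) {
        Schema itemSchema = mainSchema.getField("MainABC").schema().getElementType();
        GenericRecord item = new GenericData.Record(itemSchema);
        item.put("prod1", 1.5);
        item.put("prod2", "widget");

        GenericRecord main = new GenericData.Record(mainSchema);
        main.put("MainABC", Collections.singletonList(item));
        main.put("comnsu1", 7);
        return main;
    }

    // Encodes the record to Avro binary using the record's own schema.
    static byte[] serialize(GenericRecord record) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        BinaryEncoder encoder = EncoderFactory.get().binaryEncoder(out, null);
        new GenericDatumWriter<GenericRecord>(record.getSchema()).write(record, encoder);
        encoder.flush();
        return out.toByteArray();
    }

    // Decodes Avro binary back into a GenericRecord with the given schema.
    static GenericRecord deserialize(Schema schema, byte[] bytes) throws IOException {
        BinaryDecoder decoder = DecoderFactory.get().binaryDecoder(bytes, null);
        return new GenericDatumReader<GenericRecord>(schema).read(null, decoder);
    }

    public static void main(String[] args) throws IOException {
        Schema schema = parseMainSchema();
        GenericRecord back = deserialize(schema, serialize(sample(schema)));

        // The array field comes back as a List of nested GenericRecord values.
        List<?> items = (List<?>) back.get("MainABC");
        for (Object o : items) {
            GenericRecord item = (GenericRecord) o;
            System.out.println(item.get("prod1") + " " + item.get("prod2"));
        }
        System.out.println(back.get("comnsu1"));
    }
}
```

With the generic API, string fields are returned as Avro's Utf8 type by default, so call toString() before comparing them with a java.lang.String. When the generated classes from the plugin are available, SpecificDatumReader with the generated MainSchemaName class would be the closer match to the answer's approach.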