
In an Avro schema, how do I wrap an array in an object and put that object in a union, instead of putting the array itself in the union?


I have the following data, and I want to wrap the optional array in an object (a record) and then put that object in a union with null, rather than putting the array itself in the union. I am new to Avro and not sure how to do this.

This is what I have so far. Could someone help me correct the structure below to get the expected output? Note that dataRefs is an optional field, so this entire structure may or may not be present.

{
      "name": "dataRefs",
      "default": null,
      "type": ["null", 
        {
          "type": "array",
          "items": {
            "name": "dataRef",
            "type": "record",
            "fields": [
              {
                "name": "dataId",
                "type": "string",
                "avro.java.string": "String"
              },
              {
                "name": "email",
                "type": ["null" ,"string"],
                "avro.java.string": "String"
              },
              {
                "name": "phone",
                "type": ["null" ,"string"],
                "avro.java.string": "String"
              },
              {
                "name": "userName",
                "type": ["null" ,"string"],
                "avro.java.string": "String"
              },
              {
                "name": "addressRef",
                "default": null,
                "type": ["null", {
                  "name": "addressRefRecord",
                  "type": "record",
                  "fields": [
                    {
                      "name": "addrRefId",
                      "type": ["null","string"],
                      "avro.java.string": "String"
                    },
                    {
                      "name": "addrType",
                      "type": ["null","string"],
                      "avro.java.string": "String"
                    },
                    {
                      "name": "addressLine1",
                      "type": ["null","string"],
                      "avro.java.string": "String"
                    },
                    {
                      "name": "addressLine2",
                      "type": ["null","string"],
                      "avro.java.string": "String"
                    },
                    {
                      "name": "city",
                      "type": ["null","string"],
                      "avro.java.string": "String"
                    },
                    {
                      "name": "province",
                      "type": ["null","string"],
                      "avro.java.string": "String"
                    },
                    {
                      "name": "country",
                      "type": ["null","string"],
                      "avro.java.string": "String"
                    },
                    {
                      "name": "postalCode",
                      "type": ["null","string"],
                      "avro.java.string": "String"
                    }  
                  ]
                }
                ]
              }
            ]
          }
        }
      ]
    }

The JSON data that I intend to map to the above schema looks like this:

"dataRefs": [{
          "addressRef": {
            "addrRefId": "0",
            "addrType": "ADDRESS",
            "addressLine1": "DA 81",
            "addressLine2": "",
            "city": "Amsterdam",
            "country": "Netherlands",
            "postalCode": "xxxx LN",
            "province": ""
          },
          "dataId": "0",
          "email": "xyz@abc.com"
        }],


Solution

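  • This is how I was able to do it: wrap the array in a record that has a single array field, and put that record in the union with "null":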

    {
          "name": "dataRefs",
          "type": [
            "null",
            {
              "type": "record",
              "name": "dataRefsObject",
              "fields": [
                {
                  "name": "dataRefsArray",
                  "type": {
                    "type": "array",
                    "items": {
                      "name": "dataRef",
                      "type": "record",
                      "fields": [
                        {
                          "name": "dataId",
                          "type": ["null", "string"],
                          "avro.java.string": "String"
                        },
                        {
                          "name": "userName",
                          "type": [
                            "null",
                            "string"
                          ],
                          "avro.java.string": "String"
                        },
      ....
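
To make the effect of the wrapping concrete, here is a minimal Python sketch, not the answerer's actual code. It restates the wrapped pattern as a cut-down skeleton (the item record is reduced to a single field for brevity) and shows how the plain JSON from the question would need to be reshaped so that the array sits inside an object. The record and field names dataRefsObject and dataRefsArray come from the answer above; the helper wrap_data_refs and the added "default": null entries are assumptions made for illustration.

```python
import json

# Cut-down skeleton of the wrapped pattern. Names follow the answer above;
# the item record keeps only one field to stay short.
WRAPPED_FIELD = {
    "name": "dataRefs",
    "type": [
        "null",  # "null" listed first so that a null default is valid
        {
            "type": "record",
            "name": "dataRefsObject",
            "fields": [
                {
                    "name": "dataRefsArray",
                    "type": {
                        "type": "array",
                        "items": {
                            "type": "record",
                            "name": "dataRef",
                            "fields": [
                                {
                                    "name": "dataId",
                                    "type": ["null", "string"],
                                    "default": None,  # assumption, not in the answer
                                }
                            ],
                        },
                    },
                }
            ],
        },
    ],
    "default": None,  # assumption: carried over from the question's schema
}


def wrap_data_refs(payload):
    """Return a copy of payload with the dataRefs list wrapped in an object.

    Hypothetical helper: a missing or null dataRefs becomes None, which
    corresponds to the "null" branch of the union.
    """
    result = dict(payload)
    refs = payload.get("dataRefs")
    result["dataRefs"] = None if refs is None else {"dataRefsArray": refs}
    return result


if __name__ == "__main__":
    original = {"dataRefs": [{"dataId": "0", "email": "xyz@abc.com"}]}
    print(json.dumps(wrap_data_refs(original), indent=2))
```

This only illustrates the logical layout of the data. Avro's own JSON encoding additionally tags non-null union values with the name of the chosen branch, so the exact input expected by an Avro JSON decoder may differ from the plain shape printed here.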