java, amazon-s3, avro, parquet

How to make an Avro union of multiple types show only ONE type (the non-null one)


I have the following Avro schema:

{
  "type" : "record",
  "namespace" : "com.test.avro",
  "name" : "MultiTypeObject",
  "fields" : [
    {
      "name" : "id",
      "type" : "int"
    },
    {
      "name" : "value",
      "type" : ["null", "double", "int", "boolean"],
      "default" : null
    }
  ]
}

However, when I write the data to S3 using AvroParquet and query it in JSON format, I get this:

"myObject": {
  "id": 1,
  "value": {
    "member2": null,
    "member0": 24.439999999999998,
    "member1": null
  }
}

The question is: how can I make the value field print just the one non-null value, as below?

"myObject": {
  "id": 1,
  "value": 24.439999999999998
}

Many thanks.


Solution

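  • There isn't any way to merge them, at least for now. This is how an Avro union with more than one non-null type is represented when written to Parquet: each non-null branch becomes its own optional member field (member0, member1, member2), and only the branch that was actually set holds a value.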
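
If the stored layout cannot be changed, one possible workaround is to collapse the union on the reading side, after the query result has been parsed. The following is only a minimal sketch, assuming each row has already been parsed into a java.util.Map and that union branches use the member0, member1, ... keys shown in the question. The class and method names (UnionCollapser, collapse, flattenField) are hypothetical and not part of Avro or Parquet.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Avro or Parquet.
public class UnionCollapser {

    // Returns the single non-null "memberN" value of a union group,
    // or null when every branch is null.
    public static Object collapse(Map<String, Object> unionGroup) {
        Object found = null;
        for (Map.Entry<String, Object> entry : unionGroup.entrySet()) {
            if (entry.getKey().startsWith("member") && entry.getValue() != null) {
                if (found != null) {
                    // A union holds exactly one branch, so two values mean bad input.
                    throw new IllegalStateException("More than one union branch is set");
                }
                found = entry.getValue();
            }
        }
        return found;
    }

    // Returns a copy of the record in which the given field is replaced
    // by its collapsed value. Fields that are not maps are left as they are.
    public static Map<String, Object> flattenField(Map<String, Object> record, String field) {
        Map<String, Object> result = new LinkedHashMap<>(record);
        Object value = result.get(field);
        if (value instanceof Map) {
            @SuppressWarnings("unchecked")
            Map<String, Object> group = (Map<String, Object>) value;
            result.put(field, collapse(group));
        }
        return result;
    }
}
```

Calling UnionCollapser.flattenField(row, "value") on the row from the question would replace the member0/member1/member2 group with the plain double. This changes only what is printed, not what is stored in Parquet.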