I am trying to serialize objects that contain a Map instance with Apache Avro. The Map's string keys are deserialized correctly, but the values come back as instances of class Object.
I can get it to work by using a GenericDatumWriter with a GenericData.Record instance that has the properties copied into it, but I need to serialize the objects directly, without copying the Map properties into a temporary object just to serialize it.
    import java.io.ByteArrayInputStream;
    import java.io.ByteArrayOutputStream;
    import java.io.IOException;
    import java.util.LinkedHashMap;
    import java.util.Map;
    import org.apache.avro.Schema;
    import org.apache.avro.file.DataFileStream;
    import org.apache.avro.file.DataFileWriter;
    import org.apache.avro.reflect.ReflectData;
    import org.apache.avro.reflect.ReflectDatumReader;
    import org.apache.avro.reflect.ReflectDatumWriter;

    public void test1() throws IOException {
        TimeDot dot = new TimeDot();
        dot.lat = 12;
        dot.lon = 34;
        dot.putProperty("id", 1234);
        dot.putProperty("s", "foo");
        System.out.println("BEFORE: " + dot);

        // serialize, using the schema that reflection derives from TimeDot
        ReflectDatumWriter<TimeDot> reflectDatumWriter = new ReflectDatumWriter<>(TimeDot.class);
        Schema schema = ReflectData.get().getSchema(TimeDot.class);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataFileWriter<TimeDot> writer = new DataFileWriter<>(reflectDatumWriter).create(schema, out);
        writer.append(dot);
        writer.close();

        // deserialize
        ReflectDatumReader<TimeDot> reflectDatumReader = new ReflectDatumReader<>(TimeDot.class);
        ByteArrayInputStream inputStream = new ByteArrayInputStream(out.toByteArray());
        DataFileStream<TimeDot> reader = new DataFileStream<>(inputStream, reflectDatumReader);
        TimeDot dot2 = reader.next();
        reader.close();
        System.out.println("AFTER: " + dot2);
    }

    public static class TimeDot {
        Map<String, Object> props = new LinkedHashMap<>();
        double lat;
        double lon;

        public void putProperty(String key, Object value) {
            props.put(key, value);
        }

        @Override
        public String toString() {
            return "lat=" + lat + ", lon=" + lon + ", props=" + props;
        }
    }
Output:
BEFORE: lat=12.0, lon=34.0, props={id=1234, s=foo}
AFTER: lat=12.0, lon=34.0, props={id=java.lang.Object@2b9627bc, s=java.lang.Object@65e2dbf3}
Next, I tried to create the Schema manually, but serialization then fails with this exception:
Exception in thread "main" java.lang.NullPointerException: in TimeDot in map in java.lang.Object null of java.lang.Object of map in field props of TimeDot
    public void test2() throws IOException {
        TimeDot dot = new TimeDot();
        dot.lat = 12;
        dot.lon = 34;
        dot.putProperty("id", 1234);
        dot.putProperty("s", "foo");
        System.out.println(dot);

        // create Schema
        List<Schema.Field> propFields = new ArrayList<>();
        propFields.add(new Schema.Field("id", Schema.create(Schema.Type.INT)));
        propFields.add(new Schema.Field("s", Schema.create(Schema.Type.STRING)));
        Schema propRecSchema = Schema.createRecord("Object", null, "java.lang", false, propFields);
        Schema propSchema = Schema.createMap(propRecSchema);
        List<Schema.Field> fields = new ArrayList<>(3);
        fields.add(new Schema.Field("lat", Schema.create(Schema.Type.DOUBLE)));
        fields.add(new Schema.Field("lon", Schema.create(Schema.Type.DOUBLE)));
        fields.add(new Schema.Field("props", propSchema));
        Schema schema = Schema.createRecord("TimeDot", null, "", false, fields);
        System.out.println("\nschema:\n" + schema);

        // serialize
        ReflectDatumWriter<TimeDot> reflectDatumWriter = new ReflectDatumWriter<>(TimeDot.class);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        DataFileWriter<TimeDot> writer = new DataFileWriter<>(reflectDatumWriter).create(schema, out);
        writer.append(dot); // *** fails here > NullPointerException ***
        writer.close();

        // deserialize
        ReflectDatumReader<TimeDot> reader = new ReflectDatumReader<>(schema);
        TimeDot dot2 = reader.read(null,
                DecoderFactory.get().binaryDecoder(out.toByteArray(), null));
        System.out.println(dot2);
    }
To serialize an object that contains a Map with values of different types, you must declare the map's value type as a union in the Avro schema, listing every type a value can have.
IMPORTANT: If you do not set the namespace of the record schema correctly, deserialization returns a GenericData.Record rather than a TimeDot instance.
    // map values are declared as a union of every type that can occur
    List<Schema.Field> fields = new ArrayList<>();
    fields.add(new Schema.Field("lat", Schema.create(Schema.Type.DOUBLE)));
    fields.add(new Schema.Field("lon", Schema.create(Schema.Type.DOUBLE)));
    fields.add(new Schema.Field("props", Schema.createMap(
            Schema.createUnion(Arrays.asList(
                    Schema.create(Schema.Type.INT),
                    Schema.create(Schema.Type.STRING))))));
    // namespace plus name must resolve to the TimeDot class (see the note above)
    Schema schema = Schema.createRecord("TimeDot", null, "TestAvroUnion", false, fields);

    TimeDot dot = new TimeDot();
    dot.lat = 12;
    dot.lon = 34;
    dot.putProperty("id", 1234);
    dot.putProperty("s", "foo");
    System.out.println("BEFORE: " + dot);

    // serialize with the explicit schema instead of the reflect-derived one
    ReflectDatumWriter<TimeDot> reflectDatumWriter = new ReflectDatumWriter<>(schema);
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    DataFileWriter<TimeDot> dataWriter = new DataFileWriter<>(reflectDatumWriter);
    dataWriter.create(schema, out);
    dataWriter.append(dot);
    dataWriter.close();

    // deserialize
    ReflectDatumReader<TimeDot> reflectDatumReader = new ReflectDatumReader<>(schema);
    try (
        ByteArrayInputStream bis = new ByteArrayInputStream(out.toByteArray());
        DataFileStream<TimeDot> reader = new DataFileStream<>(bis, reflectDatumReader)
    ) {
        TimeDot dot2 = reader.next();
        System.out.println("AFTER: " + dot2);
    }
The output is as follows:
BEFORE: lat=12.0, lon=34.0, props={id=1234, s=foo}
AFTER: lat=12.0, lon=34.0, props={id=1234, s=foo}
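Because the union lists only int and string, a map value of any other type (or null) matches no branch, and Avro is expected to reject it when the object is written. The sketch below is plain Java and does not call Avro; it only checks the map up front so the offending key is easy to spot. The class name UnionValueCheck, the method firstInvalidKey, and the mapping of the int and string branches to Integer and String are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class UnionValueCheck {
    // Java classes assumed to correspond to the union branches int and string.
    // Extend this list whenever a branch is added to the union in the schema.
    static final List<Class<?>> ALLOWED = Arrays.asList(Integer.class, String.class);

    /** Returns the first key whose value fits no allowed class, or null if all values fit. */
    public static String firstInvalidKey(Map<String, Object> props) {
        for (Map.Entry<String, Object> entry : props.entrySet()) {
            Object value = entry.getValue();
            boolean matches = false;
            for (Class<?> allowed : ALLOWED) {
                // isInstance is false for null, so null values are reported too
                if (allowed.isInstance(value)) {
                    matches = true;
                    break;
                }
            }
            if (!matches) {
                return entry.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Object> props = new LinkedHashMap<>();
        props.put("id", 1234);
        props.put("s", "foo");
        System.out.println(firstInvalidKey(props)); // prints null: both values fit

        props.put("ratio", 0.5); // a Double has no matching branch
        System.out.println(firstInvalidKey(props)); // prints ratio
    }
}
```

If such a check reports a key, either convert the value or add the corresponding type (for example double) to the union in the schema.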
Alternatively, use SchemaBuilder to create the same schema:
    Schema schema = SchemaBuilder
            .record("TimeDot")
            // same namespace as in the Schema.createRecord example
            .namespace("TestAvroUnion")
            .fields()
            .name("lat")
                .type().doubleType()
                .noDefault()
            .name("lon")
                .type().doubleType()
                .noDefault()
            .name("props")
                .type().map()
                .values(SchemaBuilder.unionOf().intType().and().stringType().endUnion())
                .noDefault()
            .endRecord();
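The namespace warning can be checked before running a full round trip. The sketch below is plain Java, not Avro's API: it assumes that the reflect reader locates the target class by loading the schema's full name (namespace, a dot, then the record name), and that for nested classes the trailing dots are retried as '$'. That lookup order is an assumption based on recent Avro versions and may differ in yours. The names NamespaceCheck, Inner, and resolve are hypothetical.

```java
public class NamespaceCheck {
    /** Stand-in for a nested data class such as TimeDot. */
    public static class Inner { }

    /**
     * Tries to load "namespace.name"; on failure, replaces dots with '$'
     * from right to left, so a namespace that names an enclosing class
     * also resolves. Returns null when no candidate can be loaded.
     */
    public static Class<?> resolve(String namespace, String name) {
        String full = namespace.isEmpty() ? name : namespace + "." + name;
        StringBuilder candidate = new StringBuilder(full);
        int dot = full.length();
        while (true) {
            try {
                return Class.forName(candidate.toString());
            } catch (ClassNotFoundException e) {
                // move to the next dot on the left and treat it as a nesting separator
                dot = full.lastIndexOf('.', dot - 1);
                if (dot < 0) {
                    return null;
                }
                candidate.setCharAt(dot, '$');
            }
        }
    }

    public static void main(String[] args) {
        // namespace set to the enclosing class name: resolves to the nested class
        System.out.println(resolve(NamespaceCheck.class.getName(), "Inner"));
        // wrong namespace: nothing loads, which is the case where a generic record comes back
        System.out.println(resolve("WrongNamespace", "Inner"));
    }
}
```

If resolve returns null for the namespace and record name used in the schema, the namespace is the first thing to correct.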