Tags: scala, hadoop, record, avro

Avro GenericData.Record ignores data types


I have the following Avro schema:

{ "namespace": "example.avro",
  "type": "record",
  "name": "User",
  "fields": [
            {"name": "name", "type": "string"},
            {"name": "favorite_number",  "type": ["int", "null"]},
            {"name": "favorite_color", "type": ["string", "null"]}
            ]
 }

I use the following snippet to set up a Record:

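import java.io.File
import org.apache.avro.Schema
import org.apache.avro.generic.GenericData

val schema = new Schema.Parser().parse(new File("data/user.avsc"))
val user1 = new GenericData.Record(schema)  // strangely, this only checks for valid field names, NOT types
user1.put("name", "Fred")
user1.put("favorite_number", "Jones")  // a string, although the schema declares ["int", "null"]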

I would have expected this to fail validation against the schema, since favorite_number is declared as ["int", "null"] but is given a string.

When I add the line

user1.put("last_name", 100)

It generates a runtime error, which is what I would have expected in the first case as well.

Exception in thread "main" org.apache.avro.AvroRuntimeException: Not a valid schema field: last_name
    at org.apache.avro.generic.GenericData$Record.put(GenericData.java:125)
    at csv2avro$.main(csv2avro.scala:40)
    at csv2avro.main(csv2avro.scala)

What's going on here?


Solution

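  • The put call does not fail, because GenericData.Record only checks that the field name exists in the schema, not that the value has the right type. The mismatch surfaces when the record is serialized, because that is the point at which Avro tries to match each value against its declared type. As far as I'm aware, that is the only place where it does type checking.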
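
To make the answer concrete, here is a minimal sketch that reproduces the behaviour with the schema from the question. The object name RecordTypeCheck and the helpers badUser, isValid and serialize are made up for illustration, and the schema is inlined as a string so the example does not depend on data/user.avsc. It also shows GenericData.get().validate, which can be called explicitly when an early check is wanted instead of waiting for serialization to fail.

```scala
import java.io.ByteArrayOutputStream

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericDatumWriter, GenericRecord}
import org.apache.avro.io.EncoderFactory

import scala.util.Try

// Hypothetical helper object, for illustration only.
object RecordTypeCheck {
  // Same schema as in the question, inlined as a string.
  val schemaJson: String =
    """{"namespace": "example.avro", "type": "record", "name": "User",
      | "fields": [
      |   {"name": "name", "type": "string"},
      |   {"name": "favorite_number", "type": ["int", "null"]},
      |   {"name": "favorite_color", "type": ["string", "null"]}
      | ]}""".stripMargin

  val schema: Schema = new Schema.Parser().parse(schemaJson)

  // Builds the record from the question: put() accepts the wrongly typed value.
  def badUser(): GenericRecord = {
    val user = new GenericData.Record(schema)
    user.put("name", "Fred")
    user.put("favorite_number", "Jones") // a string where ["int", "null"] is declared
    user
  }

  // Explicit, up-front check of the record's values against the schema.
  def isValid(record: GenericRecord): Boolean =
    GenericData.get().validate(schema, record)

  // Binary-encodes the record; a type mismatch shows up here as a Failure.
  def serialize(record: GenericRecord): Try[Array[Byte]] = Try {
    val out = new ByteArrayOutputStream()
    val encoder = EncoderFactory.get().binaryEncoder(out, null)
    new GenericDatumWriter[GenericRecord](schema).write(record, encoder)
    encoder.flush()
    out.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val user = badUser()
    println(s"validate: ${isValid(user)}")
    println(s"serialization failed: ${serialize(user).isFailure}")
  }
}
```

Calling isValid right after the last put gives a fail-fast check close to what the question expected from put itself; the exact exception raised during serialization may differ between Avro versions.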