Tags: ruby-on-rails, json, ruby, serialization, avro

Ruby Avro: Record schema reports datum is not valid


I am trying to serialize JSON data with Avro against a schema. This is not a Rails app (despite the tag; I needed the attention), so I could also use a map. However, neither of the following works.

schema = { 'type' => 'record', 'name' => 'test', 'fields' => [ { 'name' => 'field_one', 'type' => 'string' },
                                                               { 'name' => 'field_two', 'type' => 'string'},
                                                               { 'name' => 'field_three', 'type' => 'string'},
                                                               { 'name' => 'field_four', 'type' => 'string' } ] }.to_json
message = {'field_one' => 'why',
             'field_two' => 'this',
             'field_three' => 'no',
             'field_four' => 'worky?'}.to_json
schema = Avro::Schema.parse(schema)
dw = Avro::IO::DatumWriter.new(schema)
buffer = StringIO.new()
encoder = Avro::IO::BinaryEncoder.new(buffer)
dw.write(message, encoder)
puts buffer.read


returns: 'The datum ... is not an example of schema ...'

Replacing schema from above with the following:

schema = { 'type' => 'map', 'values' => 'string' }.to_json

results in: 'undefined method `keys' for StringObject'

I'm not sure why I can't create a simple map with string keys and string values. The Ruby documentation for Avro is awful: very few actual examples and a lot of placeholder templates that leave a lot to the imagination. I'd like to put this serialized string into an HTTP request with content type application/json, which may be an issue since Avro wants to be binary.


Solution
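
Both errors in the question most likely come from the same place: `message` is built with `.to_json`, so the writer receives a String where it expects a Hash. The stdlib-only sketch below (no Avro involved; the two-field hash is just a shortened stand-in for the question's message) shows the difference, and also why `puts buffer.read` would print nothing even after a successful write.

```ruby
require 'json'
require 'stringio'

message_hash = { 'field_one' => 'why', 'field_two' => 'this' }
message_json = message_hash.to_json

# to_json returns a String, so Hash methods such as #keys are not available.
puts message_json.class               # String
puts message_json.respond_to?(:keys)  # false
puts message_hash.respond_to?(:keys)  # true

# After a write, the StringIO position sits at the end of the data,
# so #read has nothing left to return. #string returns the whole buffer.
buffer = StringIO.new
buffer.write('abc')
after_write = buffer.read   # ""
whole = buffer.string       # "abc"
puts after_write.empty?     # true
puts whole                  # abc
```

So the fix is to pass the plain Hash to the writer and to read the result with `buffer.string` (or rewind the buffer first).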

  • require 'avro'
    require 'json'
    require 'stringio'
    
    schema = { 'type' => 'record', 'name' => 'Test', 'fields' => [ { 'name' => 'field_one', 'type' => 'string' },
                                                                   { 'name' => 'field_two', 'type' => 'string'},
                                                                   { 'name' => 'field_three', 'type' => 'string'},
                                                                   { 'name' => 'field_four', 'type' => 'string' } ] }.to_json
    
    # Keep the message as a Hash (no .to_json): the writer validates Ruby objects, not JSON strings.
    message = {'field_one' => 'why',
                 'field_two' => 'this',
                 'field_three' => 'no',
                 'field_four' => 'worky?'}
    
    schema = Avro::Schema.parse(schema)
    datum_writer = Avro::IO::DatumWriter.new(schema)
    buffer = StringIO.new
    # DataFile::Writer produces an Avro container file: a header with the schema, then data blocks.
    file_writer = Avro::DataFile::Writer.new(buffer, datum_writer, schema)
    file_writer << message
    file_writer.close # important: flushes the pending block into the buffer
    
    result = buffer.string
    
    puts result