
Flatbuffer wire size larger than expected


I am testing a FlatBuffers serialization implementation, but the ratio of serialized size to raw data size is much larger than I expected. I realize that the format is designed to allow backward compatibility and that alignment requirements cause some bloat. However, once built, the buffer is approximately 2x the size of the raw data that I put into it. That seems large to me, and I suspect it is related to how I have structured my schema. Here is the schema that I would ideally use. It is flexible and maps logically onto the information I am trying to represent.

// IDL file

namespace Data;

// Structs \\

struct Position {
  x :short;
  y :short;
  z :short;
}

// Tables \\

table Interaction {
  pos    :Position;
  value  :uint;
}

table Event {
  interactions :[Interaction]; // 1-3 interactions are typical in a given event, but could be as high as 30
  id           :ubyte=255;
  time1        :uint;
  time2        :ulong;
}

table Packet {
  events1 :[Event];       // 1000s or more are typical in a given Packet
  events2 :[OtherEvent1]; // Other events that would be defined but occur much less frequently than events1
  events3 :[OtherEvent2]; // Other events that would be defined but occur much less frequently than events1
}

root_type Packet;

Is this 2x wire size expected given how I have structured the schema? Is it simply inevitable because each table has so few fields and the vectors have so many elements? I have tried to reduce alignment padding by artificially making every field the same size (uint). I have also tried bypassing the Interaction table and giving the Event table a vector of Position structs directly, although that gives up some of the backward compatibility I want in case I need to make changes in the future. The best ratio I have achieved is 1.7x. Is that a reasonable amount of overhead?


Solution

  • Yes, there is overhead from alignment, indirect offsets, vtables and a few other things. You're best off reading https://google.github.io/flatbuffers/flatbuffers_internals.html to get an understanding of these, which would help in designing the smallest possible representation.
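
As a rough illustration of how those overheads add up for the schema in the question, here is a back-of-envelope estimate in Python. It is only a sketch, not a measurement: it assumes 32-bit offsets and vector lengths, treats vtables as shared between tables of the same shape (so their cost is ignored), approximates field padding, and assumes `id` is actually stored rather than left at its default. The function names `pad` and `estimate_event` are made up for this example.

```python
# Back-of-envelope size estimate for one Event from the schema in the question.
# All constants are assumptions about typical FlatBuffers layout, not measured values.

UOFFSET = 4   # 32-bit offset used to reference a table or vector
SOFFSET = 4   # 32-bit offset at the start of every table, pointing to its vtable
VEC_LEN = 4   # 32-bit element count stored in front of every vector


def pad(size, align):
    """Round size up to the next multiple of align."""
    return (size + align - 1) // align * align


def estimate_event(n_interactions, interactions_as_structs=False):
    """Return (raw_bytes, wire_bytes) for one Event, ignoring shared vtables."""
    position = 3 * 2                      # three shorts
    raw_interaction = position + 4        # Position + uint value
    raw_event = 1 + 4 + 8                 # id + time1 + time2
    raw = raw_event + n_interactions * raw_interaction

    if interactions_as_structs:
        # Hypothetical variant: Interaction as a struct stored inline in the vector.
        per_interaction = pad(raw_interaction, 4)
    else:
        # Table: vtable offset + padded fields, plus one offset in the parent vector.
        per_interaction = SOFFSET + pad(raw_interaction, 4) + UOFFSET

    # Event table: vtable offset, offset to the interactions vector, scalar fields,
    # padded to 8 bytes because of the ulong.
    event_table = pad(SOFFSET + UOFFSET + raw_event, 8)
    wire = (
        UOFFSET                           # slot for this Event in Packet.events1
        + event_table
        + VEC_LEN                         # length prefix of the interactions vector
        + n_interactions * per_interaction
    )
    return raw, wire


if __name__ == "__main__":
    for n in (1, 2, 3, 30):
        raw, wire = estimate_event(n)
        raw_s, wire_s = estimate_event(n, interactions_as_structs=True)
        print(n, raw, wire, round(wire / raw, 2), wire_s, round(wire_s / raw_s, 2))
```

Under these assumptions, most of the overhead comes from the per-table vtable offset and the per-element offset in each vector of tables, which weigh heavily when a table holds only 10 to 13 bytes of payload. Storing small fixed-layout items as structs removes both, at the cost of no longer being able to add fields to them later.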