
Is there a way to work around the limit of 255 types in a flatbuffers union?


I am using flatbuffers to serialize rows from SQL tables. I have a Statement.fbs that defines a statement as Insert, Update, Delete, etc. The statement has a member "Row" that is a union of all SQL table types. However, I have more than 255 tables, and I get this error when compiling with flatc:

$ ~/flatbuffers/flatc --cpp -o gen Statement.fbs
error: /home/jkl/fbtest/allobjects.fbs:773: 18: error: enum value does not fit [0; 255]

I looked through the flatbuffers code and I see that an enum is automatically created for union types and that the underlying type of this enum is uint8_t.

I do not see any options for changing this behavior.

I am able to create an enum that handles all my tables by specifying the underlying type to be uint16 in my flatbuffer schema file.

The statement schema:

include "allobjects.fbs";

namespace Database;

enum StatementKind : byte { Unknown = 0, Insert, Update, Delete, Truncate }

table Statement {
  kind:StatementKind;
  truncate:[TableKind];
  row:Row;
}

root_type Statement;

The allobjects Row union is a bit large to include here.

union Row {
    TypeA,
    TypeB,
    TypeC,
    Etc,
    ...
}

I suppose this is a design decision for flatbuffers that union types should only use one byte. I can accept that, but I would really like a workaround.


Solution

  • Sadly, this is a bit of a design mistake, and there is no workaround yet. Making it configurable is possible, but would be a fair bit of work given the number of language ports that rely on it being a byte. See e.g. here: https://github.com/google/flatbuffers/issues/4209

    Yes, multiple unions is a clumsy workaround.

    An alternative could be to define the type as an enum. Now you have the problem that you don't have a typesafe way to store the table, though. That could be achieved with a "nested flatbuffer", i.e. storing the union value as a vector of bytes, which you can then cheaply call GetRoot on with the correct type, once you inspected the enum.
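
    A minimal schema sketch of this enum plus byte-vector idea. It assumes TableKind is the uint16 enum mentioned in the question; the field names row_kind and row_data are hypothetical:

    ```
    table Statement {
      kind:StatementKind;
      truncate:[TableKind];
      row_kind:TableKind;   // tells the reader which table type row_data holds
      row_data:[ubyte];     // a complete, separately finished FlatBuffer for that row
    }
    ```

    With this layout, a C++ reader would presumably switch on row_kind() and then call something like flatbuffers::GetRoot<TypeA>(statement->row_data()->Data()) for the matching type. The writer builds each row in its own FlatBufferBuilder, finishes it, and copies the resulting bytes into the outer buffer as a vector. The cost is that the schema no longer ties the enum value to the stored type, so that check becomes the application's job.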

    Another option may be an enum + a union, if the number of unique kinds of records is < 256. For example, you may have multiple row types that, even though they have different names, contain just a string, so they can be merged into one type for the union.
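
    A sketch of how the enum + union variant might look, again assuming TableKind is the uint16 enum from the question. StringRow, RowShape and row_kind are hypothetical names introduced only for illustration:

    ```
    // Hypothetical merged table: every row type whose only content is a string shares it.
    table StringRow { value:string; }

    // One entry per distinct layout rather than per SQL table, so it stays under 256.
    union RowShape { StringRow, TypeA, TypeB }

    table Statement {
      kind:StatementKind;
      truncate:[TableKind];
      row_kind:TableKind;   // which SQL table the row came from
      row:RowShape;         // how the row's contents are laid out
    }
    ```

    Here the union still gives type-safe access to the contents, while the wider enum carries the table identity that the one-byte union tag cannot.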

    Another hack could be to declare a table RowBaseClass {} or whatever, which would be the type of the field, but which you would never actually instantiate. You then cast back and forth to that type to store the actual table, depending on the language you're using.