c++ serialization deserialization ipc flatbuffers

How to use flat buffers when the schema is not fixed?


My C++ application currently works as follows:

1. It launches another process and uses Windows shared memory to communicate between the two processes.
2. The data is serialized in one process and deserialized in the other. The data type can vary with user input, so the type is serialized as well, letting the deserializer interpret the data correctly.
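
For readers unfamiliar with this kind of setup, a type-tagged format of the sort described in step 2 might look roughly like the sketch below. This is only an illustration, not the asker's actual code: the `Value` alternatives, the `Tag` enum, and the helper names are all hypothetical, it assumes C++17, and it assumes both processes run on the same machine so that byte order and type sizes match.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <string>
#include <variant>
#include <vector>

// Hypothetical set of value types the user can select at runtime.
using Value = std::variant<std::int32_t, double, std::string>;

// Hypothetical one-byte tag written ahead of each payload.
enum class Tag : std::uint8_t { Int32 = 0, Double = 1, String = 2 };

// Append the raw bytes of a trivially copyable value to the buffer.
template <typename T>
void appendRaw(std::vector<std::uint8_t>& out, const T& v) {
    const auto* p = reinterpret_cast<const std::uint8_t*>(&v);
    out.insert(out.end(), p, p + sizeof(T));
}

// Read a trivially copyable value at `offset`, with a bounds check.
template <typename T>
T readRaw(const std::vector<std::uint8_t>& in, std::size_t offset) {
    if (offset > in.size() || in.size() - offset < sizeof(T)) {
        throw std::runtime_error("truncated buffer");
    }
    T v;
    std::memcpy(&v, in.data() + offset, sizeof(T));
    return v;
}

// Write the tag first, then the payload the tag describes.
std::vector<std::uint8_t> serialize(const Value& value) {
    std::vector<std::uint8_t> out;
    if (const auto* i = std::get_if<std::int32_t>(&value)) {
        out.push_back(static_cast<std::uint8_t>(Tag::Int32));
        appendRaw(out, *i);
    } else if (const auto* d = std::get_if<double>(&value)) {
        out.push_back(static_cast<std::uint8_t>(Tag::Double));
        appendRaw(out, *d);
    } else {
        const auto& s = std::get<std::string>(value);
        out.push_back(static_cast<std::uint8_t>(Tag::String));
        appendRaw(out, static_cast<std::uint32_t>(s.size()));  // length prefix
        out.insert(out.end(), s.begin(), s.end());
    }
    return out;
}

// Read the tag, then decode the payload accordingly.
Value deserialize(const std::vector<std::uint8_t>& in) {
    if (in.empty()) throw std::runtime_error("empty buffer");
    switch (static_cast<Tag>(in[0])) {
        case Tag::Int32:
            return readRaw<std::int32_t>(in, 1);
        case Tag::Double:
            return readRaw<double>(in, 1);
        case Tag::String: {
            const auto len = readRaw<std::uint32_t>(in, 1);
            const std::size_t start = 1 + sizeof(std::uint32_t);
            if (in.size() - start < len) {
                throw std::runtime_error("truncated string");
            }
            return std::string(in.begin() + start, in.begin() + start + len);
        }
    }
    throw std::runtime_error("unknown type tag");
}
```

The reader never needs to know the type ahead of time: calling `deserialize(serialize(Value{2.5}))` yields a `Value` holding a `double`, because the tag byte travels with the data.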

I now intend to use FlatBuffers to serialize and deserialize the data, mainly for its random access and backward compatibility.
To do that, I need clarity on a few points and would appreciate some help.

  1. Based on the data type, I can generate a schema programmatically and feed it to flatc.exe to generate the files. Instead of invoking flatc.exe, however, I am thinking of building a flatc.dll from the open-source code and calling that, to keep the interaction simpler. Is that the wiser choice?

  2. What I am less sure about is the following. I would create a schema and invoke the FlatBuffers compiler while the application is running, which generates some C++ files. As far as I understand, I would then need to build those files somehow and plug the resulting binary into both the serializer and the deserializer so they can process the actual data, all while the application is running. How do I achieve this?
    The problem stems from the fact that my application has no fixed schema. What is the general approach to using FlatBuffers when the schema is variable?

I hope my question is clear. If not, please let me know and I will be happy to provide more details. Thanks in advance for your answers.


Solution

  • The short answer is that you do not want this. Generating C++ at runtime, compiling it into a DLL, and then loading it back into your process is feasible, but it is an extremely clumsy way of doing it.

    The data structures of your program must be known at compile time (if it is written in C++), so why can't you define a schema for that just once and compile it ahead of time? Does your program allow the user to "design" data structures at runtime?

    For extremely dynamic use cases, such as where the user can create arbitrary objects, I'd recommend FlexBuffers (https://google.github.io/flatbuffers/flexbuffers.html). They can be used inside a FlatBuffer to store "unknown" data, or even as their own serialization format. With these, you can serialize objects whose structure is only known at runtime. They have most of the same efficiency properties as FlatBuffers, and you won't need to bundle a C++ compiler with your program :)

    The best approach is a combination of the two: store all data known at compile time in FlatBuffers, and the remainder in FlexBuffers.