c# performance io protobuf-net concurrentdictionary

Can I use protobuf-net to serialize and deserialize ConcurrentDictionaries to the same file and then read it?


I am working on the communication manager for a naval simulator, which handles communications and also stores all the data. I need to serialize large ConcurrentDictionaries to a file quickly, at most 10 times per second. The total number of serializations varies: this is our attempt to implement a replay feature, so it depends on how long the simulation runs. After some research I landed on protobuf-net, but I can't figure it out. The objects I need to work with are fairly complex classes with many properties, some of which are custom types, and they inherit from classes that in turn inherit from other classes. The concurrent dictionaries look something like ConcurrentDictionary<string, MyClass>. The only thing I think I got right is decorating the classes with the attributes; I can't figure out the rest.

I need to serialize many times to the same file and then read that file back to recreate the ConcurrentDictionaries. I have tried different approaches without any success, so I need a direction.

I think the problem is that I serialize multiple times to the same file, and when I deserialize I just feed the file as a stream to the Serializer.Deserialize method. I think it tries to read the whole file at once and fails.

EDIT: as suggested, I added more details regarding the program.


Solution

  • There is a bit to unpack here.

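    To serialize multiple independent objects to the same stream you can use Serializer.SerializeWithLengthPrefix / Serializer.DeserializeWithLengthPrefix. A protobuf message does not mark its own end, so without a length prefix the reader cannot tell where one object stops and the next begins. With the prefix you can write objects one after another and read them back one at a time. See ProtoInclude for how to handle inheritance.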
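
    A minimal sketch of the length-prefixed approach, assuming the protobuf-net NuGet package is referenced. The types Entity, Ship, Snapshot and the helper ReplayFile are hypothetical stand-ins for the real simulator classes, and the field number 1 and tag 100 are arbitrary choices:

    ```csharp
    using System.Collections.Generic;
    using System.IO;
    using ProtoBuf; // protobuf-net NuGet package

    // Hypothetical base class; ProtoInclude registers each subclass
    // under a tag that must not clash with the base's ProtoMember tags.
    [ProtoContract]
    [ProtoInclude(100, typeof(Ship))]
    public class Entity
    {
        [ProtoMember(1)] public string Id { get; set; } = "";
    }

    // Hypothetical subclass; member tags are numbered per class.
    [ProtoContract]
    public class Ship : Entity
    {
        [ProtoMember(1)] public double Speed { get; set; }
    }

    // One recorded frame of the simulation (hypothetical).
    [ProtoContract]
    public class Snapshot
    {
        [ProtoMember(1)] public double Time { get; set; }
        [ProtoMember(2)] public Dictionary<string, Entity> Entities { get; set; } = new Dictionary<string, Entity>();
    }

    public static class ReplayFile
    {
        // Appends one snapshot as a length-prefixed record.
        public static void Append(Stream destination, Snapshot snapshot) =>
            Serializer.SerializeWithLengthPrefix(destination, snapshot, PrefixStyle.Base128, 1);

        // Reads the records back lazily, one per iteration, until the stream ends.
        // The stream must stay open while the result is being enumerated.
        public static IEnumerable<Snapshot> ReadAll(Stream source) =>
            Serializer.DeserializeItems<Snapshot>(source, PrefixStyle.Base128, 1);
    }
    ```

    When writing to disk, opening the file with new FileStream(path, FileMode.Append) lets each call to Append add one more record. The prefix style and field number used for reading have to match the ones used for writing.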

    To serialize each object I would prefer to convert it into a separate type that is only used for serialization, sometimes called a Data Transfer Object or DTO. This lets you separate the concerns of serialization from all kinds of domain logic, at the cost of some duplication of code.
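
    As a rough illustration of the DTO idea, assuming protobuf-net is referenced: Vessel is a hypothetical domain class with behaviour, VesselDto and FrameDto are hypothetical serialization-only types, and FrameMapper copies between them. Only the DTOs carry protobuf attributes.

    ```csharp
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using ProtoBuf; // protobuf-net NuGet package

    // Hypothetical domain class: has logic, knows nothing about serialization.
    public class Vessel
    {
        public string Name { get; set; } = "";
        public double Heading { get; set; }
        public void Turn(double degrees) => Heading = (Heading + degrees) % 360.0;
    }

    // Hypothetical DTO: plain data only.
    [ProtoContract]
    public class VesselDto
    {
        [ProtoMember(1)] public string Name { get; set; } = "";
        [ProtoMember(2)] public double Heading { get; set; }
    }

    // Hypothetical DTO for one frame; uses a plain Dictionary.
    [ProtoContract]
    public class FrameDto
    {
        [ProtoMember(1)] public Dictionary<string, VesselDto> Vessels { get; set; } = new Dictionary<string, VesselDto>();
    }

    public static class FrameMapper
    {
        // Copies the live dictionary into a DTO that can be serialized safely.
        public static FrameDto ToDto(ConcurrentDictionary<string, Vessel> live)
        {
            var frame = new FrameDto();
            foreach (var pair in live.ToArray()) // ToArray copies the current key/value pairs
                frame.Vessels[pair.Key] = new VesselDto { Name = pair.Value.Name, Heading = pair.Value.Heading };
            return frame;
        }

        // Rebuilds a live dictionary from a deserialized frame.
        public static ConcurrentDictionary<string, Vessel> FromDto(FrameDto frame)
        {
            var live = new ConcurrentDictionary<string, Vessel>();
            foreach (var pair in frame.Vessels)
                live[pair.Key] = new Vessel { Name = pair.Value.Name, Heading = pair.Value.Heading };
            return live;
        }
    }
    ```

    ToArray fixes the set of entries, but the Vessel objects themselves can still be modified by other threads while they are being copied, so the copy is only as consistent as the surrounding locking allows.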

    There are a few ways to manage size. One approach is to only store changes to state, not the entire state. Something similar is sometimes used for games, where you only need to record the user input to be able to replay the entire game. Another approach is compression, since consecutive states probably do not differ much. LZ4 claims to be one of the faster algorithms around. You might also want to keep things in memory if possible, since even the fastest SSD is much slower than memory.
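
    The "store only changes" idea could look roughly like the sketch below, which covers only the delta part and not compression. FrameDelta and DeltaCodec are hypothetical names, and the comparison relies on the value type's own equality, so a class used as TValue would need a meaningful Equals for unchanged entries to be detected.

    ```csharp
    using System.Collections.Generic;

    // Hypothetical record of what changed between two consecutive frames.
    public class FrameDelta<TValue>
    {
        public Dictionary<string, TValue> Upserts { get; } = new Dictionary<string, TValue>();
        public List<string> Removed { get; } = new List<string>();
    }

    public static class DeltaCodec
    {
        // Collects entries that are new or changed, plus keys that disappeared.
        public static FrameDelta<TValue> Diff<TValue>(
            IReadOnlyDictionary<string, TValue> previous,
            IReadOnlyDictionary<string, TValue> current)
        {
            var delta = new FrameDelta<TValue>();
            var comparer = EqualityComparer<TValue>.Default;
            foreach (var pair in current)
            {
                if (!previous.TryGetValue(pair.Key, out var old) || !comparer.Equals(old, pair.Value))
                    delta.Upserts[pair.Key] = pair.Value;
            }
            foreach (var key in previous.Keys)
            {
                if (!current.ContainsKey(key))
                    delta.Removed.Add(key);
            }
            return delta;
        }

        // Rebuilds the next frame from the previous frame and a delta.
        public static Dictionary<string, TValue> Apply<TValue>(
            IReadOnlyDictionary<string, TValue> previous,
            FrameDelta<TValue> delta)
        {
            var next = new Dictionary<string, TValue>();
            foreach (var pair in previous) next[pair.Key] = pair.Value;
            foreach (var key in delta.Removed) next.Remove(key);
            foreach (var pair in delta.Upserts) next[pair.Key] = pair.Value;
            return next;
        }
    }
    ```

    A replay would then start from one full frame and apply the deltas in order. Writing a full frame every so often keeps seeking cheap and limits the damage if one record is corrupted.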

    I would highly recommend setting up a simple test environment. For example, start by serializing a simple object, then continue with a complex object, a dictionary of complex objects, and so on. This is also a good opportunity to measure performance.
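
    One possible starting point for such a test environment, assuming protobuf-net is referenced: a hypothetical RoundTrip.Measure helper that serializes a value to memory, reads it back, and reports the size and elapsed time. Probe is a deliberately trivial hypothetical type, to be replaced by progressively more complex ones.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.IO;
    using ProtoBuf; // protobuf-net NuGet package

    // Hypothetical minimal type for the first test.
    [ProtoContract]
    public class Probe
    {
        [ProtoMember(1)] public int Value { get; set; }
        [ProtoMember(2)] public string Label { get; set; } = "";
    }

    public static class RoundTrip
    {
        // Serializes to memory, deserializes again, and returns the copy,
        // the serialized size in bytes, and the total elapsed time.
        public static (T Copy, long Bytes, TimeSpan Elapsed) Measure<T>(T value)
        {
            var watch = Stopwatch.StartNew();
            using var buffer = new MemoryStream();
            Serializer.Serialize(buffer, value);
            buffer.Position = 0;
            var copy = Serializer.Deserialize<T>(buffer);
            watch.Stop();
            return (copy, buffer.Length, watch.Elapsed);
        }
    }
    ```

    The first call for a given type is likely to include one-time setup inside the serializer, so it makes sense to call Measure once as a warm-up and time later runs, and to compare the copy against the original field by field before trusting any timing.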