c# serialization encapsulation decoupling

Best place for serialisation code. Internal to class being serialised, or external class per format?


I often find myself in a quandary about where to put serialisation code for a class, and was wondering what others' thoughts on the subject were.

Bog-standard serialisation is a no-brainer. Just decorate the class in question with the relevant attributes.

My question is more about classes that get serialised over a variety of protocols or to different formats, and that require some thought/optimisation in the process rather than just blindly serialising decorated properties.

I often feel it's cleaner to keep all code to do with one format in its own class. It also allows you to add more formats just by adding a new class, e.g.

// Signatures only; method bodies are omitted for brevity.
class MyClass
{
}

class JSONWriter
{
    public void Save(MyClass o);
    public MyClass Load();
}

class BinaryWriter
{
    public void Save(MyClass o);
    public MyClass Load();
}

class DataBaseSerialiser
{
    public void Save(MyClass o);
    public MyClass Load();
}

//etc

However, this often means that MyClass has to expose a lot more of its internals to the outside world so that other classes can serialise it effectively. This feels wrong, and goes against encapsulation. There are ways around it, e.g. in C++ you could make the serialiser a friend, or in C# you could expose certain members through an explicit interface implementation, but it still doesn't feel great.
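For anyone unfamiliar with the explicit interface trick mentioned above, here is a minimal sketch of what I mean. The interface name ISerialisableState and the Counter member are made up purely for illustration; the point is only that the member is hidden from normal users of MyClass and reachable only by code that deliberately casts to the interface.

```csharp
using System;

// Hypothetical interface intended for serialisers only (name is an assumption).
public interface ISerialisableState
{
    int Counter { get; set; }
}

public class MyClass : ISerialisableState
{
    private int counter;

    // Normal public behaviour of the class.
    public void Increment() => counter++;

    // Explicit implementation: not visible through a MyClass reference,
    // only through an ISerialisableState reference.
    int ISerialisableState.Counter
    {
        get => counter;
        set => counter = value;
    }
}

public static class SerialiserHelper
{
    // A serialiser has to opt in with a cast; o.Counter would not compile.
    public static int ReadCounter(MyClass o)
    {
        return ((ISerialisableState)o).Counter;
    }
}
```

This keeps the internals off the everyday public surface, although anyone who knows about the interface can still cast to it, so it is a convention rather than real access control.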

The other option, of course, is to have MyClass know how to serialise itself to/from various formats:

class MyClass
{
    public void LoadFromJSON(Stream stream);
    public void LoadFromBinary(Stream stream);

    public void SaveToJSON(Stream stream);
    public void SaveToBinary(Stream stream);
    //etc
}

This feels more encapsulated and correct, but it couples the formatting to the object. What if some external class knows how to serialise more efficiently because of some context that MyClass doesn't know about? (Maybe a whole bunch of MyClass objects reference the same internal object, so an external serialiser could optimise by serialising that object only once.) Additionally, if you want a new format, you have to add support in all your objects, rather than just writing a new class.
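To make the shared-object case concrete, here is a rough sketch of an external writer that emits each shared object once and has the items refer to it by index. SharedConfig, DedupingWriter and the line-based text format are all invented for this example, and the public fields are there only to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical object shared by many MyClass instances.
public class SharedConfig
{
    public string Name = "";
}

public class MyClass
{
    public int Id;
    public SharedConfig Config = new SharedConfig();
}

public static class DedupingWriter
{
    // Writes every distinct SharedConfig once, then one line per item
    // that refers to its config by index.
    public static string Save(IEnumerable<MyClass> items)
    {
        // SharedConfig does not override Equals, so keys compare by reference.
        var ids = new Dictionary<SharedConfig, int>();
        var shared = new StringBuilder();
        var body = new StringBuilder();

        foreach (var item in items)
        {
            if (!ids.TryGetValue(item.Config, out int configId))
            {
                configId = ids.Count;
                ids.Add(item.Config, configId);
                shared.Append($"config {configId} {item.Config.Name}\n");
            }
            body.Append($"item {item.Id} config={configId}\n");
        }

        return shared.ToString() + body.ToString();
    }
}
```

A SaveToX method on MyClass sees only one instance at a time, so it has no way of knowing that a config has already been written by a sibling object.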

Any thoughts? Personally I have used both methods depending on the exact needs of the project, but I just wondered if anyone had some strong reasons for or against a particular method?


Solution

  • The most flexible pattern is to keep the objects lightweight and use separate classes for specific types of serialisation.

    Imagine the situation if you were required to add another 3 types of data serialisation. Your classes would quickly become bloated with code they do not care about. "Objects should not know how they are consumed."
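As a minimal sketch of the separate-class approach, assuming a common contract is wanted: the IMyClassSerialiser interface, the BinarySerialiser class and the Name/Value properties below are hypothetical names chosen for illustration, not anything from the question.

```csharp
using System;
using System.IO;
using System.Text;

public class MyClass
{
    public string Name { get; set; } = "";
    public int Value { get; set; }
}

// Hypothetical contract: one implementation per format.
public interface IMyClassSerialiser
{
    void Save(MyClass o, Stream stream);
    MyClass Load(Stream stream);
}

public class BinarySerialiser : IMyClassSerialiser
{
    public void Save(MyClass o, Stream stream)
    {
        // leaveOpen: true so the caller keeps ownership of the stream.
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
        {
            writer.Write(o.Name);   // length-prefixed string
            writer.Write(o.Value);  // 4-byte integer
        }
    }

    public MyClass Load(Stream stream)
    {
        using (var reader = new BinaryReader(stream, Encoding.UTF8, leaveOpen: true))
        {
            // Fields are read back in the same order they were written.
            return new MyClass { Name = reader.ReadString(), Value = reader.ReadInt32() };
        }
    }
}
```

Adding another format then means adding another IMyClassSerialiser implementation, and MyClass itself does not change. The trade-off raised in the question still applies: the serialiser can only work with whatever MyClass exposes.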