c++ boost boost-serialization

boost serialization fails on cyclic restore


I am trying to implement serialization for two classes, Geometry and Dimension, that have a cyclic dependency: a Geometry can have Dimensions, and a Dimension knows its Geometries. I also have a DataModel that contains a vector of geometries and a vector of dimensions. My classes look like this:

#include <utility>
#include <vector>

class IGeometry
{
public:
    virtual ~IGeometry() = default;
};

class IDimension
{
public:
    virtual ~IDimension() = default;
    virtual const std::vector<IGeometry*>& GetGeometries() const = 0;
};

class Dimension : public virtual IDimension
{
public:
    Dimension(std::vector<IGeometry*> f) : geometries{ std::move(f) } {}
    ~Dimension() override = default;

    const std::vector<IGeometry*>& GetGeometries() const override 
    {
        return geometries;
    }

private:
    std::vector<IGeometry*> geometries;
};

class Geometry : public virtual IGeometry
{
public:
    ~Geometry() override {}

    void AddDimension(IDimension* dimension)
    {
        dimensions.emplace_back(dimension);
    }
    const std::vector<IDimension*>& GetDimensions() const { return dimensions; }
private:
    std::vector<IDimension*> dimensions{};
};

struct DataModel
{
    std::vector<IGeometry*> geometries;
    std::vector<IDimension*> dimensions;
};

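I am using Boost's non-intrusive serialization like this:

#include <boost/archive/text_iarchive.hpp>
#include <boost/archive/text_oarchive.hpp>
#include <boost/serialization/base_object.hpp>
#include <boost/serialization/export.hpp>
#include <boost/serialization/split_free.hpp>
#include <boost/serialization/vector.hpp>
#include <fstream>

BOOST_SERIALIZATION_SPLIT_FREE(Geometry)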

BOOST_CLASS_EXPORT(Dimension)
BOOST_CLASS_EXPORT(Geometry)

namespace boost
{
namespace serialization
{

template<class Archive>
void serialize(Archive& ar, IGeometry& g, const unsigned int version){ }

template<class Archive>
void serialize(Archive& ar, IDimension& d, const unsigned int version){ }

template<class Archive>
void save(Archive& ar, const Geometry& g, const unsigned int version)
{
    ar& boost::serialization::base_object<IGeometry>(g);
    ar& g.GetDimensions();
}
template<class Archive>
void load(Archive& ar, Geometry& g, const unsigned int version)
{
    ar& boost::serialization::base_object<IGeometry>(g);

    std::vector<IDimension*> dimensions;
    ar& dimensions;
    for(auto* dimension : dimensions)
    {
        g.AddDimension(dimension);
    }
}

template<class Archive>
void serialize(Archive& ar, Dimension& d, unsigned int version)
{
    ar& boost::serialization::base_object<IDimension>(d);
}

template<class Archive>
void save_construct_data(Archive& ar, const Dimension* t, const unsigned int)
{
    ar& t->GetGeometries();
}

template<class Archive>
void load_construct_data(Archive& ar, Dimension* t, const unsigned int file_version)
{
    std::vector<IGeometry*> foos;
    ar& foos;
    ::new(t)Dimension(foos);
}

template<class Archive>
void serialize(Archive& ar, DataModel& model, const unsigned int version)
{
    ar& model.dimensions & model.geometries ;
}

}
}

void SaveModel(const DataModel& model)
{
    std::ofstream ofs("filename");
    boost::archive::text_oarchive oa(ofs);
    oa << model;
}

void RestoreModel(DataModel& model)
{
    std::ifstream ifs("filename");
    boost::archive::text_iarchive ia(ifs);
    ia >> model;
}

int main()
{
    {
        auto* p0 = new Geometry();
        auto* p1 = new Geometry();

        auto* d2 = new Dimension({ p0, p1 });

        p0->AddDimension(d2);
        p1->AddDimension(d2);

        DataModel model;

        model.geometries.emplace_back(p0);
        model.geometries.emplace_back(p1);
        model.dimensions.emplace_back(d2);

        SaveModel(model);
    }

    DataModel model2;
    RestoreModel(model2);

    return 0;
}

The example in main() fails while restoring the model, with the following exception:

Exception thrown at 0x00007FF7F62BA385 in boost.exe: 0xC0000005: Access violation reading location 0xFFFFFFFFFFFFFFFF.

My guess is that, while restoring, Boost tries to deserialize the Dimension with its two geometries in load_construct_data. It then deserializes one of the geometries, which in turn refers back to the Dimension, which of course has not been constructed yet.

I am new to boost::serialization.

My questions are:

  • Is serialization of my data model like this feasible?
  • Is the general serialization code "correct" (with regard to the interfaces etc.)?
  • What am I doing wrong when deserializing the cyclic dependency?

Without the abstract classes it works fine!


Solution

  • Cyclical references are fine. Your problem appears to come from the virtual base classes.

    In particular, commenting out the virtual keyword here:

    class Dimension : public /*virtual*/ IDimension {
    

    makes the problem go away. My only hunch is that somehow the base-object needs to be constructed before the load-construct-data happens.

    Indeed, removing the need for load/save construct data does work even with the virtual base (live demo on Coliru).

    Thinking about it, it makes sense:

    • serializing a Dimension serializes its related geometries
    • the geometries are read back in load_construct_data; because the Dimension has not been constructed at that point, object tracking for it cannot have been completed yet
    • however, deserializing the geometries indirectly deserializes their related dimensions, which requires object tracking for those dimensions to be complete. Since it is not, the wrong branch is taken and the "effectively corrupt" stream leads the code into Undefined Behaviour
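
The role tracking plays in a cyclic restore can be illustrated with a toy loader. This is a Boost-free sketch and not Boost's actual implementation: Node, ToySaver, ToyLoader, roundtrip_cycle and the "new"/"ref" text format are all hypothetical names invented for illustration. The only point it makes is that a back-reference can be resolved only if the referenced object is already known to the loader when the reference is read.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Toy graph node with non-owning links that may form cycles.
struct Node {
    std::vector<Node*> links;
};

// Writes "new <id> <count> ..." the first time a node is seen,
// and "ref <id>" for every later occurrence.
class ToySaver {
public:
    explicit ToySaver(std::ostream& os) : os_(os) {}

    void save(const Node* n) {
        auto it = ids_.find(n);
        if (it != ids_.end()) {
            os_ << "ref " << it->second << ' ';
            return;
        }
        const std::size_t id = ids_.size();
        ids_[n] = id;  // track the node before descending into its links
        os_ << "new " << id << ' ' << n->links.size() << ' ';
        for (const Node* l : n->links) save(l);
    }

private:
    std::ostream& os_;
    std::map<const Node*, std::size_t> ids_;
};

// Reads the format above. With register_early == false the node is only
// registered after its links were read, so a cycle cannot be resolved.
class ToyLoader {
public:
    ToyLoader(std::istream& is, bool register_early)
        : is_(is), early_(register_early) {}

    Node* load() {
        std::string tag;
        std::size_t id = 0;
        if (!(is_ >> tag >> id)) throw std::runtime_error("truncated stream");
        if (tag == "ref") {
            auto it = table_.find(id);
            if (it == table_.end())
                throw std::runtime_error("back-reference to untracked object");
            return it->second;
        }
        pool_.push_back(std::make_unique<Node>());
        Node* n = pool_.back().get();
        if (early_) table_[id] = n;  // address is known before the contents
        std::size_t count = 0;
        if (!(is_ >> count)) throw std::runtime_error("truncated stream");
        for (std::size_t i = 0; i < count; ++i) n->links.push_back(load());
        if (!early_) table_[id] = n;  // too late for references inside a cycle
        return n;
    }

private:
    std::istream& is_;
    bool early_;
    std::map<std::size_t, Node*> table_;
    std::vector<std::unique_ptr<Node>> pool_;  // owns the restored nodes
};

// Round-trips the cycle a <-> b and reports whether it was restored.
bool roundtrip_cycle(bool register_early) {
    Node a, b;
    a.links.push_back(&b);
    b.links.push_back(&a);

    std::stringstream ss;
    ToySaver saver(ss);
    saver.save(&a);

    ToyLoader loader(ss, register_early);
    try {
        Node* a2 = loader.load();
        return a2->links.size() == 1 && a2->links[0]->links.size() == 1 &&
               a2->links[0]->links[0] == a2;
    } catch (const std::runtime_error&) {
        return false;
    }
}
```

In this toy, roundtrip_cycle(true) restores the cycle, while roundtrip_cycle(false) fails with an unresolved back-reference. The toy fails cleanly with an exception; a real archive that misreads its stream has no such guarantee.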

    Summary

    Two solutions:

    • Drop the (unnecessary?) virtual base class modifiers
    • Use friend access or intrusive serialization, so that the related entities are serialized from within the class and load/save construct data is no longer needed